Information For The Programmers: 2010

Friday, December 17, 2010

Building SQL Statements with multiple values for the "IN" CLAUSE

You would think that SQL would be able to have a easy way where you could pass it a simple string, with some values separated by commas, and it could throw that right into a sql statement like:

SELECT *
FROM Clients
WHERE ClientID IN (1,5,6)

However, this just doesn't work in sql. In fact, there is a very complex solution, which involves creating a function and then passing in you values separated by spaces as a NVARCHAR.

The function looks like this

FUNCTION [dbo].[iter_intlist_to_tbl] (@list nvarchar(MAX))
   RETURNS @tbl TABLE (listpos int IDENTITY(1, 1) NOT NULL,
   number int NOT NULL) AS
BEGIN
   DECLARE @startpos int,
   @endpos int,
   @textpos int,
   @chunklen smallint,
   @str nvarchar(4000),
   @tmpstr nvarchar(4000),
   @leftover nvarchar(4000)

   SET @textpos = 1
   SET @leftover = ''
   WHILE @textpos <= datalength(@list) / 2
   BEGIN
   SET @chunklen = 4000 - datalength(@leftover) / 2
   SET @tmpstr = ltrim(@leftover +
   substring(@list, @textpos, @chunklen))
   SET @textpos = @textpos + @chunklen

   SET @startpos = 0
   SET @endpos = charindex(' ' COLLATE Slovenian_BIN2, @tmpstr)

   WHILE @endpos > 0
   BEGIN
   SET @str = substring(@tmpstr, @startpos + 1,
   @endpos - @startpos - 1)
   IF @str <> ''
   INSERT @tbl (number) VALUES(convert(int, @str))
   SET @startpos = @endpos
   SET @endpos = charindex(' ' COLLATE Slovenian_BIN2,
   @tmpstr, @startpos + 1)
   END

   SET @leftover = right(@tmpstr, datalength(@tmpstr) / 2 - @startpos)
   END

   IF ltrim(rtrim(@leftover)) <> ''
   INSERT @tbl (number) VALUES(convert(int, @leftover))

   RETURN
END

Pretty crazy huh. All this does is pretty much build a table with the unique values you throw into it, so you can join on those values with the column you are trying to get the data from, and pull out the rows where any of the values match any of the rows in that table.

You utilize this function like so:

PROCEDURE [dbo].[Client_Get_All_By_ClientIDs]

@clientIDs NVARCHAR(20)

AS

SELECT *
  FROM Client
  WHERE ClientID IN
  (SELECT i.number from iter_intlist_to_tbl(@ClientIDs) i)

Then you run it like this

EXEC [dbo].[Client_Get_All_By_ClientIDs] '2 4 7 190'

And there it is. That is the simple way to do a mutiple IN CLAUSE SQL statement. Remember that if you are building your values to input into the stored procedure, to separate them by spaces " ", and NOT commas ",".

Friday, December 10, 2010

Fiddler

I was having the toughest time trying to figure out why my http request get was coming back from http://www.networksolutions.com/whois/index.jsp when I was trying to do a whois lookup.

It turned out that I was doing the search correctly, but I needed to have a referrer page or the website knew that it was an automated website request. How did I figure all this out? Fiddler.

Fiddler is a program that monitors all of the http request and get traffic on your computer. It stores all of the information such as the headers sent, the cookies in each website request, and other things.

In order to get past the automatic robot checking software for most sites, all you need to do is make sure your url is formed properly. In order to do this, open up fiddler. You can download it here at http://www.fiddler2.com/fiddler2/. Once you open it up, you can clear out all the traffic by selecting all of the items in the first window, and then deleting them.

Open up a browser, go to the website where you want to make your specific request, and then clear out your fiddler list. Make your specific request, and then make your request. When you search google.com on the whois lookup at http://www.networksolutions.com, here is the information that fiddler shows me in the Raw tab.

GET http://www.networksolutions.com/whois-search/google.com HTTP/1.1
Host: www.networksolutions.com
Connection: keep-alive
Referer: http://www.networksolutions.com/whois/index.jsp
Cache-Control: max-age=0
Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/534.7 (KHTML, like Gecko) Chrome/7.0.517.41 Safari/534.7
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Cookie: siteId=527-10; _csuid=X162083227e04de; randomstring=; vrsnsf=161b3bac4c69008e3c39e201757c; landing=P99C8S570N0B9A1D670E0000V101; cart="|time=Fri Dec 03 17:19:22 EST 2010|sessionId=e31843b13a3d9c28b992aaedf8e7|cookie=161b3bac4c69008e3c39e201757c|cart={[H(3y)<lavineflowers.biz][DOM_BIZ(5y)<lavineflowers.biz]}{[H(3y)<lavineflowers.com][DOM_COM(5y)<lavineflowers.com]}"; JSESSIONID=2dc0cc11539eb8a2f5c74ee4e538; JROUTE=GMxg; loginSelectorDestination=; __utmz=82970249.1292027830.9.6.utmgclid=CIOI6Nv64qUCFQGe7Qod4WH9VA|utmccn=(not%20set)|utmcmd=(not%20set)|utmctr=network%20solutions; RVTFN=1-877-357-7586; RVRF=nsgooglebrand-network_solutions{night}-exact-101; RVID=1514294; RVNS_SESSID=2dc0cc11539eb8a2f5c74ee4e538; vertigo=false; s_cc=true; __utmv=; __utma=82970249.1039984786.1290123989.1291786510.1292027830.9; __utmc=82970249; __utmb=82970249.3.10.1292027830; test=none; s_sq=netsolglobal%3D%2526pid%253Dnet%25257C%252520whois%25253Eindex.jsp%2526pidt%253D1%2526oid%253Dhttp%25253A%25252F%25252Fwww.networksolutions.com%25252Fimg%25252Fbuttons%25252Fsearch-blue.png%2526ot%253DIMAGE; currency=USD

Pretty cool huh.

You will notice that it shows the host site -- very important. Most important for this query, it tells the referrer page in the web request get itself. So if the live site is sending itself that data, it is probably important for the automated request to contain that information too.

This is done by setting the request. referrer field by doing something like:

request.Referer = referer;

Also, the fiddler contains the information stored in the cookie. While I didn't need to store the cookie information for this request, or for most requests, some requests will require that.

I ran into that problem when I was doing a whois lookup at domain tools.com.

I had to save the sessionId after I had logged on to the site, otherwise it would only allow a limited number of whois requests. I also had to save the other cookie information that apparently contained the id for the website I was looking up. I did this with the following:

1. First, I created a cookie container in my class called _cookieContainer.

   private CookieContainer _cookieContainer = new CookieContainer();

2. Then in my web get method I make sure to add any cookies coming back from that uri into my cookie container.

   response = (HttpWebResponse)request.GetResponse();

   response.Cookies = request.CookieContainer.GetCookies(request.RequestUri);

   _cookieContainer.Add(request.RequestUri, response.Cookies);

3. Whenever I submit a request make sure to set its cookie container to the one you got form the last web request.

   request.CookieContainer = _cookieContainer;

4. This is like a circular loop. It keeps storing the relevant cookie information, and sending that out with each subsequent request.

This will set the response cookies correctly, then on your next request

In order to add cookies manually, the following line of code works:

cookieContainer.Add(new Cookie("dtsession", "6bfcc307d2b0bf52f78597ba1f1e50da", "/", "domaintools.com"));
cookieContainer.Add(new Cookie("SessionToken", "8ea23c8f35383f75a0fa6c23356337e74f8bd7c4", "/", "domaintools.com"));

Happy web scraping.

Friday, December 3, 2010

Check Boxes in Rad Grids

So this week I had to update a rad grid on the 29Prime administration site for the sales people and customers of 29Prime. There was an inbox of messages displayed in a telerik rad grid, and we had the information displayed for each message that someone else in the company had sent you or that you had sent them. It was basically like a simplified email system on the website.

My task was to make this rad grid so that when you clicked on a checkbox that you had received or completed the task in the message, you could check some check boxes to say that. One check box said Accepted, and another one said Completed.

As the site was set up, the rad grid would automatically, using javascript, open up a window displaying the message for whatever row in the rad box you had selected.

For background, a telerik rad grid is a lot like a normal grid view, if you have ever used them. The difference is that rad grids are able to be given different styles to make them look cooler. You can select a premade template from a list in the designer view if you just look for it.

The problem was that having to open up a popup window for each message was a little tedious, and the page had to refresh itself every time you closed a popup window to track in the grid if that message was marked accepted or completed in the popup. My job was to make it so you could simply check a check box in the rad grid and it would update itself instantaneously. In order to do this I had to change the column where the checkboxes were now from

<telerik:GridBoundColumn

to

<--
<telerik:GridTemplateColumn HeaderText="Accepted">
   <ItemTemplate>
   <asp:CheckBox ID="AcceptedCheckBox" ValidationGroup='<%#Eval("AlertID") %>'
   runat="server" AutoPostBack="true" Checked='<%# Bind("Accepted") %>'
   OnCheckedChanged="AcceptedCheckBox_OnCheckedChanged" />
   </ItemTemplate>
</telerik:GridTemplateColumn>
-->

Then, from the code behind I used the validation group to tell the database which message, (or Alert, as the system called it) that I wanted to change. I could not find any property in an asp checkbox that would take a non-constant value. I tried to use skinID, and I tried to use other attributes of an asp check box, but either the system would give me an error that those attributes could not be changed when the rad grid was bound, or that when I used the ToolTip attribute, it would show up when the mouse hovered over the rad grid.

That could be a useful tip for debugging, to see which rows are showing up in your grid, but I could not let those kind of features go on a professional looking site. I settled with using the ValidationGroup attribute because that was the only thing that I could set on databind, and wouldn't show up as a tooltip on mouse hover.

The best part of the whole process is that the program updates the database really fast whenever the checkbox is changed. I type cast the sender as a checkbox, and then use the checkbox1.checked feature to set the database boolean field as to the value of the checkbox.

The page doesn't even have to refresh itself. It's like the database update is running on a different thread from the website view.

Friday, November 12, 2010

Hello World! :):):):)