Download this file:
This was my personal favorite visualization. It shows a three-dimensional directed graph in which each node represents a different page and each edge represents a hyperlink from one page to another. The data for a graph like this could be retrieved by parsing each HTML file and looking for <a href> tags. However, I didn't do it this way. Because the intention was to use the same data source for all of our visualizations in this final project, I had to get a little creative. Each page hit in the log files has a referer field, which records the page the user came from. If the field is blank, it usually means the user typed the page address into the address bar of their browser manually. If they came from a search engine like Google, the referer is a long, hard-to-read string containing the query that was typed into Google to retrieve that link.
Retrieving a usable data set from the log files was actually the hardest of all the data mining operations needed for this final project. To retrieve this data I used Microsoft's Import/Export tool that came with SQL Server 7. I exported the data from the database using an SQL query that grouped all the distinct (page, referer) pairs and wrote them to a file. This got rid of all duplicate links, drastically reducing the amount of data I had to slog through. With this shorter data set I was then able to manually pull out just the pieces I wanted and store them in a separate file.
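The de-duplication itself was done with an SQL query, which is not reproduced here. Purely as an illustration of what "distinct page, referer pairs" means, the same idea can be sketched in C. Everything in this sketch (the Link struct, AddUniqueLink, and the two size limits) is a hypothetical name chosen for the example, not part of the original export.

```c
#include <string.h>

#define MAX_LINKS 1024   /* hypothetical capacity of the link table */
#define MAX_URL   256    /* hypothetical maximum URL length */

/* One directed edge: a visitor followed a link from `referer` to `page`. */
typedef struct Link {
    char page[MAX_URL];
    char referer[MAX_URL];
} Link;

/* Add the (page, referer) pair unless it is already stored.
   Returns 1 if the pair was added, 0 if it was a duplicate or
   the table is full. URLs longer than MAX_URL - 1 are truncated. */
int AddUniqueLink(Link *links, int *count,
                  const char *page, const char *referer)
{
    for (int i = 0; i < *count; i++) {
        if (strcmp(links[i].page, page) == 0 &&
            strcmp(links[i].referer, referer) == 0)
            return 0;   /* this link has been seen before */
    }
    if (*count >= MAX_LINKS)
        return 0;       /* no room left */

    strncpy(links[*count].page, page, MAX_URL - 1);
    links[*count].page[MAX_URL - 1] = '\0';
    strncpy(links[*count].referer, referer, MAX_URL - 1);
    links[*count].referer[MAX_URL - 1] = '\0';
    (*count)++;
    return 1;
}
```

Calling AddUniqueLink once per log entry leaves one entry per distinct link in the table, which is the same effect the query had on the exported file. The linear scan is slow for large logs; it is used here only because it is short.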
The end result of the visualization looks like this:
Note that it will look a little different every time you run it, because the position and color of each node are random. Each edge takes the color of its source node.
--Begin Code --
The following struct holds the location and color of a sphere. It also stores the web page that the node represents. This is necessary because, when we start loading the edges, the source and destination of each line are found by retrieving the positions of the spheres with the corresponding page names.
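The struct definition itself is not shown here, so the following is only a guess at its shape based on the description above. The field names, the MAX_PAGE_NAME limit, and the SetNodePage helper are all hypothetical; only the name NodeInfo comes from the PositionSphere prototype.

```c
#include <string.h>

#define MAX_PAGE_NAME 256   /* hypothetical limit on page-name length */

/* Hypothetical layout: one sphere (node) per web page. */
typedef struct NodeInfo {
    float x, y, z;               /* position of the sphere in 3D space */
    float r, g, b;               /* sphere color, each channel in [0, 1] */
    char  page[MAX_PAGE_NAME];   /* web page this node represents */
} NodeInfo;

/* Hypothetical helper: store a page name in a node,
   truncating it if it is too long. */
void SetNodePage(NodeInfo *node, const char *page)
{
    strncpy(node->page, page, MAX_PAGE_NAME - 1);
    node->page[MAX_PAGE_NAME - 1] = '\0';
}
```

Keeping the page name inside the node is what lets an edge be drawn from just two page names: look up both nodes by name, then read their positions.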
This function uses random_integer to randomly position the node and to randomly select the color for that node.
void PositionSphere(NodeInfo *node);
This generates a random integer in the range of lowest_number to highest_number.
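The bodies of these two helpers are not shown, so here is one way they might look. This is a sketch under several assumptions: the range is treated as inclusive at both ends, the coordinate bounds of -100 to 100 are invented for the example, and the NodeInfo fields are hypothetical (the struct is repeated in reduced form so the block compiles on its own).

```c
#include <stdlib.h>

/* Hypothetical NodeInfo layout, repeated so this block is self-contained. */
typedef struct NodeInfo {
    float x, y, z;   /* position */
    float r, g, b;   /* color, each channel in [0, 1] */
} NodeInfo;

/* Return a random integer in [lowest_number, highest_number].
   Assumes highest_number >= lowest_number. The modulo introduces a
   slight bias, which does not matter for choosing positions and colors. */
int random_integer(int lowest_number, int highest_number)
{
    int range = highest_number - lowest_number + 1;
    return lowest_number + rand() % range;
}

/* Give the node a random position and a random color. */
void PositionSphere(NodeInfo *node)
{
    node->x = (float)random_integer(-100, 100);   /* assumed scene bounds */
    node->y = (float)random_integer(-100, 100);
    node->z = (float)random_integer(-100, 100);

    node->r = random_integer(0, 255) / 255.0f;
    node->g = random_integer(0, 255) / 255.0f;
    node->b = random_integer(0, 255) / 255.0f;
}
```

Seeding the generator once at startup, for example with srand((unsigned)time(NULL)), would explain why the layout differs on every run. Without a seed, rand() produces the same sequence each time.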
This loops through all the nodes and returns the node that corresponds to the specified page. If there is no associated node, it returns -1.
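A lookup like this might be written as a simple linear search. The function name FindNode, its parameters, and the reduced NodeInfo layout below are hypothetical, and the sketch assumes that "returns the node" means returning the node's index in an array, which fits the -1 result for a missing page.

```c
#include <string.h>

#define MAX_PAGE_NAME 256   /* hypothetical limit on page-name length */

/* Hypothetical NodeInfo layout, reduced to the field this lookup needs. */
typedef struct NodeInfo {
    char page[MAX_PAGE_NAME];   /* web page this node represents */
} NodeInfo;

/* Return the index of the node whose page name matches `page`,
   or -1 if no node represents that page. */
int FindNode(const NodeInfo *nodes, int node_count, const char *page)
{
    for (int i = 0; i < node_count; i++) {
        if (strcmp(nodes[i].page, page) == 0)
            return i;
    }
    return -1;
}
```

When loading an edge, FindNode would be called twice, once for the referer and once for the page, and the edge skipped if either call returns -1.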