Tuesday, February 28, 2006

Driving the Speed Limit

You can thank Aaron for introducing me to this viral video. See what happens when a group of people decide to drive the posted speed limit on I-75 through Atlanta. Four lanes of the highway blocked by four cars all going 55 miles per hour. Pretty funny!

http://video.google.com/videoplay?docid=-5366552067462745475&q=speed

Nick Landry on DNR

On this week's .NET Rocks!, Nickolas Landry talks about Windows Mobile 5.0.

Say, isn't he the one who was trying to pick up the girl on Caesars 24/7 (reality show on A&E):

"... In Vegas, it's said that for every man there's a woman and for every woman there's a man. But does that include geeks? Four big time nerds in town for a computer programmer's convention decide to take a break from boring seminars and try to pick-up a woman they meet at the bar. Are they up to the challenge?"

Reference: http://weblogs.asp.net/cfranklin/archive/2005/01/19/355841.aspx

Monday, February 27, 2006

Upcoming Speaking Engagements

Greg Huber and I have a few more Media Center presentations lined up:

Feb 28, Northwest Ohio .NET User Group - http://www.nwnug.com

March 23, Central Ohio .NET Developers Group - http://www.condg.org/default.aspx

Abstract:
Will this be the year of the Home Theater PC (HTPC)? Come learn about Windows Media Center 2005 and decide for yourself. Topics covered include: Overview of Media Center, Alternatives to Media Center, The 10-foot Experience, Developing for Media Center, and using the Xbox 360 as an Extender.

Come check 'em out if you get a chance!

Disclaimer: Blatantly plagiarized from Greg's post!

I'm really digging this tag-team presentation style. It lets one person have a little bit of downtime to mentally prepare for their next part while the other person does their thing.

Greg and I still do post-mortems and refine the presentation every time that we do it: we watch the audience and make note of where people were falling asleep, and then decide to either improve that section of the agenda, or drop it altogether. And, of course, we've made it part of the show to make an actual sacrifice to the Presentation Gods (hoping that this gesture somehow makes the gremlins go into hiding).

Four Things Meme

This meme has gotten pretty popular lately among the blogs that I follow, so I knew it was only a matter of time until I myself got tagged. Thanks, Jim!

Four Jobs I've Had...

  1. Baling Hay and Straw
  2. McDonald's
  3. Sorting Freight
  4. Programmer

Four movies I can watch over and over...
  1. Back To The Future (trilogy)
  2. Office Space
  3. There's Something About Mary
  4. The Matrix

Four TV shows I love to watch...
  1. LOST
  2. Ghost Hunters
  3. Two and a Half Men
  4. My Name is Earl

Four places I've been on vacation...
  1. Cancun
  2. Turks and Caicos Islands
  3. Virginia Beach
  4. Smoky Mountains

Four favorite dishes...
  1. Lasagna
  2. Mashed Potatoes and Gravy (served with just about anything else)
  3. Steak (medium-rare)
  4. Any burrito at Qdoba

Four websites I visit daily...
  1. www.bloglines.com (aggregates almost all of my reading)
  2. www.spaceweather.com
  3. news.google.com
  4. statcounter.com (Narcissism: I regularly check who reads my blog)

Four places I'd rather be...
  1. On vacation somewhere that it is warm
  2. Hiking
  3. Riding a motorcycle with no particular destination
  4. Sitting at the swim-up bar in a pool at the Moon Palace in Cancun

Four bloggers I'm tagging...
  1. Aaron
  2. Greg
  3. Carl
  4. Rory

resizeable != resizable

Here's a tip that might save you 30 minutes of debugging a silly problem: Resizable is spelled R-E-S-I-Z-A-B-L-E, not "resizeable".

When it's used in the features string passed to window.open(), it seems that the feature name must be spelled correctly in order to work. ;-)

Hating Mark Miller About Now

Just wanted to add to the cliche: CodeRush rocks! And Refactor! Pro is not too shabby, either.

Now, I'm hating Mark Miller (the Architect of CodeRush) because I have to think of a way to come up with the funds to buy these products outright (or get my employer to do it for me, which I would greatly prefer!).

BTW: Inspired by the dnrTV episode, I wrote a CodeRush plugin myself. The first iteration was quite simple, but when I wanted to get into some language-independent code generation, I quickly got lost in the DXCore API. However, Mr. Miller was extremely helpful in pointing me in the right direction. Great product, great personal support--these guys are awesome players in a neat little niche. Plus, they're sponsors of .NET Rocks, which carries a lot of weight when it comes to me selecting/recommending products.

Saturday, February 25, 2006

Xbox 360 Extender: My "Session Terminated" Cause Identified

Before Greg Huber and I presented on Media Center for GANG, I had built an MCE box that was clean and contained everything that we needed for the presentation. Previously (at the Dayton-Cincinnati Code Camp), we had his MCE PC, his laptop, and my laptop, and experienced all kinds of technical issues trying to network everything together for the presentation.

The box worked great, and was all set to go. Well, that was until I tried to connect my XBox 360 to it (which was needed for part of the presentation).

I ran through the PCSETUP program, and everything seemed fine. But, when the XBox attempted to connect to the Media Center session, I would always get a "Session Terminated" error. I tried for hours that day to get the connection to work, but everything failed. Then, when I tried to systematically remove one installed program at a time (hoping that it was my DVD software, or something else that was 3rd party), I ended up hosing the box big-time.

So, we ended up reverting back to Greg's MCE machine and both of our laptops... :(

I recently paved that machine and reinstalled everything. But, wouldn't you know it, the "Session Terminated" problem was still there! However, this time, I had the opportunity to dig deeper into the problem, since I wasn't pushing up against the start time of a presentation.

I was actually able to determine that a GeForce FX5200 was at fault. In order to promote discussion, I detailed the situation on The Green Button:

http://www.thegreenbutton.com/community/shwmessage.aspx?ForumID=58&MessageID=157807

Friday, February 24, 2006

The Code Room Episode 3 Is Out

The third installment of The Code Room is now online:

http://msdn.microsoft.com/msdntv/episode.aspx?xml=episodes/en/20060223CodeRoom3/manifest.xml

This episode is obviously scripted, whereas the previous two episodes were "reality television". Nonetheless, there is still a very important message presented: developers must consider security during the entire development process, and once a system is compromised, you cannot just patch the original holes because the hackers probably have a collection of backdoors.

(The first Code Room "starred" Chris Menegay, Scott Bellware, and Tracy Sawyer.)

Thursday, February 23, 2006

Good Week For Observing ISS in North America

Ever get a chance to see the International Space Station (ISS) with your own eyes? Its orbit will provide for a series of highly favorable passes over North America over the course of the next week or so.

To be favorable for naked-eye observation, you must be in the dark while the ISS, flying overhead, is illuminated by the sun (so, this happens either hours after sunset or hours before sunrise).

Use the Heavens-Above web site to help figure out when the ISS is visible from your location, and where to look in the sky. Be sure to synchronize your clock with the official time (important, since the window is only a few minutes).

When it's flying over, you pretty much can't miss it. It will be comparable to viewing Venus or an airplane with its landing lights on, but will slowly move across the sky until it crosses into the day/night terminator, at which time, it will just disappear.

Heavens-Above will also calculate for you other satellites that you can observe in pretty much the same way. The best viewing that I can remember was one year when the ISS and Space Shuttle flew overhead shortly after the Shuttle had detached from the space station. So, instead of one point of light, I saw two in close proximity to one another.

Windows Vista SKU Featureset Revealed

I hope that KenLin had the authority to reveal this info at this time:

http://msmvps.com/blogs/kenlin/archive/2006/02/23/84582.aspx

Since I've been presenting on Media Center lately, I immediately scanned the features matrix to see if it was going to be available across all of Vista's SKUs. Bottom line: Nope! Only the Windows Vista Home Premium and Windows Vista Ultimate SKUs will include the Media Center functionality.

Wednesday, February 22, 2006

MSDN on DVD

I received my first MSDN shipment yesterday from the subscription that I was awarded for being a Finalist in the Connected Systems Developer Competition.

I've been in charge of receiving the MSDN subscription media for my branch, which we receive as being part of the Microsoft Certified Partner program. These always came on CD, because all of our servers have CD-ROM drives (not a one has a DVD drive).

But, my personal subscription is on DVD. I'm amazed at how many products are crammed onto each disc, and by how much smaller the shipments are when compared to the number of CDs that we would normally receive at a time.

This comes as a relief to me, because I don't know what I would do with hundreds of CDs in my house (it's going to be bad enough keeping track of scores of DVDs!)

Mug Shot

The observant ones among you regular readers might notice a new addition to the sidebar...

Tuesday, February 21, 2006

Miniature Windmill Generator

This should come as no surprise to anyone, but somebody came up with a miniature windmill generator that looks like a small electric fan. At peak output, it appears that it can do 1 amp with a potential difference of 12 volts, which could be used to recharge a cell phone.

Big deal? Yeah. But, it interested me because of my thoughts a while ago of using arrays of micro-generators, like these, in order to produce significant levels of power (maybe enough for your house):

http://jasonf-blog.blogspot.com/2005/10/pressure-generator-array.html

One caveat, though: The huge electric generating windmills with the 30-foot blades continue to spin after the wind has stopped. This is because they store energy as angular momentum, like a flywheel. So, that benefit would not exist with the array of micro-generators (essentially, when the wind stops, so does your electric generation).

Monday, February 20, 2006

Burned by Floating Point Numbers

For those following along at home, you know that I've been working over the past few weeks with a big SQL Server database on a relatively small partition (until the new hardware arrives). Because of this, I've been trying to keep space consumption/database growth under strict control. When you're talking about millions of rows of data in a table, even 1 byte difference in the size of one numeric field translates to millions of bytes of extra disk space.

Well, one choice that I made was to represent decimal numbers using the "real" data type. This is a floating point number that uses 4 bytes per value. Other possible alternatives were all 8 bytes per number, so I essentially cut my storage in half (albeit, with some "acceptable" loss in precision).

While reconciling the results of my calculations to the set of control data, we came across a very good example of where this absolutely is the wrong choice of data type.

One field of the table contained Quantity information, and this was not always integer data (i.e., maybe it was weight). Well, my "real" numbers had no problems representing this data, for the most part. But, there was one point of sale where all of the sales were reversed (so that every record with a positive Quantity value would have a corresponding record with a negative Quantity value). The net quantity for this location was zero.

But, upon examination, my net quantity was coming out to be something like 0.000007472 due to errors introduced by the floating point math. Yeah, it's close to zero, but, it led to a serious flaw in my calculations.

You see, I was already trapping for Divide by Zero situations using a CASE clause:

SELECT CASE 
WHEN SumQty <> 0
THEN (Qty/SumQty)* ValToSpread
ELSE 0
END

If it weren't for the floating point math, then this particular SumQty value would have been zero, and the result of the calculation would have been zero. But, in this case, the floating point's "zero" was a small fraction--not zero.

A tangent: Do you know why Divide by Zero is illegal? It's because the Y-axis (x=0) is an asymptote for the hyperbola y = a / x. This means that as the number that you're dividing by gets closer and closer to zero (i.e., a fraction approaching zero), the magnitude of the quotient gets larger and larger (approaches infinity). So, dividing by zero is undefined because the quotient heads toward +infinity from one side of zero and -infinity from the other, and no single value fits both.

What's the point of this mathematical mumbo jumbo? Well, the sum of my quantity was not zero, but rather a small fraction: 0.000007472. The result of dividing by this number turned out to be a very big number, when the true result that was needed in order to reconcile with the control data should have been zero. So, my results were way off of where they should have been (to the tune of millions of dollars). Whoops! Good thing this was just in QA!

To resolve this one case, I changed the quantity field's data type to a "decimal", which is a fixed-precision number (I can specify how many total digits and decimal digits are maintained). I still went from a 4-byte number to an 8-byte number, but the result of the calculation was dead on.

So, the lesson learned: sometimes it's quite necessary to trade the space and speed provided by floating point numbers for the accuracy of fixed precision.
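To see the effect outside of SQL Server, here is a minimal Python sketch. It assumes that rounding through a 4-byte IEEE 754 float is a fair stand-in for the "real" type; the quantities and the to_real, sum_as_real, and spread helpers are made up for illustration and are not from the actual project.

```python
import struct
from decimal import Decimal

def to_real(x):
    """Round x to the nearest 4-byte float (assumed stand-in for "real")."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_as_real(values):
    """Add values one at a time, rounding the running total to 4 bytes each step."""
    total = 0.0
    for v in values:
        total = to_real(total + to_real(v))
    return total

def spread(qty, sum_qty, val_to_spread):
    """Same guard as the CASE expression: only divide when the sum is non-zero."""
    return (qty / sum_qty) * val_to_spread if sum_qty != 0 else 0

# Hypothetical quantities: every sale has a matching reversal, so the true net is zero.
quantities = ["100000.0", "0.001", "-100000.0", "-0.001"]

real_sum = sum_as_real(float(q) for q in quantities)
decimal_sum = sum(Decimal(q) for q in quantities)

print("real sum:   ", real_sum)      # small, but not zero
print("decimal sum:", decimal_sum)   # exactly zero
print("share using real sum:   ", spread(100000.0, real_sum, 500.0))
print("share using decimal sum:", spread(Decimal("100000.0"), decimal_sum, Decimal("500")))
```

The small quantity is lost when it is added to the large running total, so it is never cancelled by its reversal. The non-zero "zero" then sails past the guard and produces a huge share, while the fixed-precision sum is caught by the guard as intended.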

Day of .NET

Here's an exciting announcement:

What: Day of .NET (1-day Conference)
When: May 13, 2006 9:00 AM - 5:15 PM
Where: Washtenaw Community College, Ann Arbor, MI

Website: http://dayofdotnet.org/

A cooperative effort between:

Sunday, February 19, 2006

Breaking Change in No-Touch Deployment (2.0 Framework)

A peer of mine (Murph) brought this breaking change to my attention:

In prior versions of the .NET Framework (v1.0 and v1.1), you had an option to distribute executables by hosting all of the files (EXE, config, and other necessary assemblies) on a web server, and then opening the EXE file by means of a URL. This is actually a precursor to ClickOnce, and was referred to as No-Touch Deployment (or HREF EXEs).

Well, it seems that by simply installing the 2.0 Framework onto a machine, you break the ability to continue using HREF EXEs.... when the server is in the Internet zone. The web server will log the GET request for the EXE, but you will not see any requests logged for the config file, and the user will get an OPEN or SAVE AS dialog, which in turn does absolutely nothing.

Credit to another member of my organization (Jones) for pointing out the following MSDN Product Feedback record, which discusses that this is by design:

http://lab.msdn.microsoft.com/ProductFeedback/viewfeedback.aspx?feedbackid=ef4ae9a2-1d40-4241-aca4-61d579929793

The workaround is to get the site out of the Internet zone. If it's a trusted server, then you can trust the site, and that will change the zone. Otherwise, you're left creating a new CAS policy. Either way, this does not sound like fun if the application was deployed to hundreds of client machines using HREF EXEs.

Saturday, February 18, 2006

Spot the Bug

Here is a completely made-up query, but it is close to one that I wrote recently to spread an overhead cost value (@MarketingExpenses) over a list of invoices. This is just one step in a complex series of calculations that helps to determine profitability of products or services sold.

SELECT 
CASE
WHEN COALESCE(b.sum_of_sales,0) > 0
THEN (a.sales / b.sum_of_sales) * @MarketingExpenses
ELSE 0
END AS ShareOfMarketingExpenses
FROM
invoice a LEFT JOIN regional_sum b
ON a.region_code = b.region_code


This query assumes that there's a table called [regional_sum] that has the sum of sales for each region. The [invoice] table is joined to [regional_sum] by linking the "region_code" fields of both tables, and since a LEFT JOIN is used, it will include all rows from [invoice] and only those rows in [regional_sum] that match (if there is not a match, then NULL values will be used for the [regional_sum] data).

Now, the point of the COALESCE function is to return the first non-NULL value in the supplied list of values (b.sum_of_sales and zero). In this case, I'm simply trying to handle NULL values by changing them into zeros.

You'll also notice that I'm using a SQL CASE statement. I do this because I need to prevent the dreaded "divide by zero" error, so I ensure that the "sum_of_sales" field is not zero before trying to calculate a percentage (otherwise, I return zero as the result of the calculation).

The percentage is then multiplied by the predetermined "@MarketingExpenses" value in order to give you a number of marketing dollars that could be attributed to that one invoice record ("ShareOfMarketingExpenses").

Let me just assure you that the query above compiles and works as written, so the bug is not a syntax error. However, there was an issue that came up as we were trying to reconcile data from the new system with old/validated numbers (the sum of all of the "ShareOfMarketingExpenses" values did not equal the specified "@MarketingExpenses" value).

I'll leave it as an exercise to the reader, but I'll also provide some sample data:

============
REGIONAL_SUM
============

region_code sum_of_sales
----------- ------------
a0001 12345.67
a0002 432312.33
b0001 -123.23
b0004 34322.23
b0005 -12.12
c0006 234328.23

(Negative sales are probably due to internal accounting of moving inventory around from one region to another, but I'm not that close to this [fictitious] business, so I can't explain the numbers any better).

=======
INVOICE
=======

customer_id region_code sales
----------- ----------- ----------
1 a0001 12.32
1 a0001 33.23
2 b0001 5.34
3 c0006 1232.21
3 b0004 322.22
4 a0003 232.23
[the list goes on, but pretend that the
sales for each region correctly sums up
to the values listed in the above table]

It was a pretty cool "Ah ha!" moment for me when I spotted the flaw. Your mileage may vary.
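If you'd rather experiment than eyeball it, here's a rough Python sketch that loads the sample rows into an in-memory SQLite database and runs the same query. SQLite is an assumption here (the original ran on SQL Server), the :marketing_expenses parameter stands in for @MarketingExpenses, the value 1000.0 is made up, and only the six invoice rows listed above are loaded, so the totals will not reconcile on their own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE regional_sum (region_code TEXT, sum_of_sales REAL);
    CREATE TABLE invoice (customer_id INTEGER, region_code TEXT, sales REAL);
""")

# Sample data copied from the tables above.
conn.executemany("INSERT INTO regional_sum VALUES (?, ?)", [
    ("a0001", 12345.67), ("a0002", 432312.33), ("b0001", -123.23),
    ("b0004", 34322.23), ("b0005", -12.12), ("c0006", 234328.23),
])
conn.executemany("INSERT INTO invoice VALUES (?, ?, ?)", [
    (1, "a0001", 12.32), (1, "a0001", 33.23), (2, "b0001", 5.34),
    (3, "c0006", 1232.21), (3, "b0004", 322.22), (4, "a0003", 232.23),
])

# Same CASE/COALESCE/LEFT JOIN logic, with extra columns selected for inspection.
QUERY = """
    SELECT a.customer_id, a.region_code, a.sales, b.sum_of_sales,
           CASE WHEN COALESCE(b.sum_of_sales, 0) > 0
                THEN (a.sales / b.sum_of_sales) * :marketing_expenses
                ELSE 0
           END AS ShareOfMarketingExpenses
    FROM invoice a LEFT JOIN regional_sum b
      ON a.region_code = b.region_code
    ORDER BY a.rowid
"""

rows = conn.execute(QUERY, {"marketing_expenses": 1000.0}).fetchall()
for row in rows:
    print(row)
```

Look for the rows whose share comes back as 0, and ask whether each one really should.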

Thursday, February 16, 2006

Is Google Preventing Analysis of Their Search Results?

Robert Scoble has been doing an experiment over the past few days where he asks his readers to include a simple non-existent word in their blogs in order to create a new meme: brrreeeport.

Some initial findings can be found here:

http://scobleizer.wordpress.com/2006/02/15/brrreeeport-crazy-and-more-search-engine-lies/

Now I don't know the full story, but Google is apparently no longer serving up any results for the term "brrreeeport". Is the DNE (Do No Evil) Gang worried about Robert's findings surrounding the accuracy of their reported search results? (In other words, it appears that Scoble's experiment is calling them out on the fact that they inflate the reported number of search matches)

Where did all of the brrreeeport matches go?

UPDATE Feb 17, 2006 8:00AM EST: I checked this morning, and the 22,700 results are back online from the main Google web search. But, since you can only have a maximum of 1000 results returned for any given query, how can anyone verify that there are indeed 27,000 real matches for "brrreeeport" in the Google database?

They're Back!

Tuesday, February 14, 2006

Special Appearance, 1 Night Only!

Greg and I will be presenting at the Great Lakes Area .NET User Group (GANG) meeting on Wednesday, February 15th at 6PM (Microsoft Southfield office (in the Fifth-Third tower near Evergreen and 10-mile/Lodge)).

http://www.migang.org/Default.aspx?tabid=22

The topic will be Windows XP Media Center Edition 2005. The first part will include a general introduction to MCE (what, why, and how). The second part will include some examples of how to develop for this platform. And, we're sure to whip out my XBox 360 at one point or another.

There is a bit of irony in me doing this presentation on Wednesday: I'll be bringing my HTPC machine with me for the demo, so there will be no PVR at my home to record LOST. :)

BlogCode looks cool

Scoble blogged about this: If you have a blog that you like, BlogCode will let you find other blogs that are similar in nature. It goes well beyond just keyword matching.

brrreeeport!

Monday, February 13, 2006

Future of Touchscreens

You MUST watch this if you haven't already:

http://www.youtube.com/w/Crazy-Multi-Input-Touch-Screen?v=zp-y3ZNaCqs&search=crazy%20multi-input%20touch%20screen

UPDATE: Richard covered this on this week's Mondays!

Sunday, February 12, 2006

Cool Video

HV Switch Opening at a power substation

Friday, February 10, 2006

Overlooked Database Space Optimization

While looking for places to cut down the size of this huge database that I'm working with, I realized that all strings were being stored in nvarchar fields, which use Unicode, at 2 bytes per character. This is likely because the database was originally prototyped in MS-Access, and when they imported all of the Access data into SQL Server 2000 using DTS, it transformed the Text fields into nvarchar fields.

Well, it turns out that all of the string data is ASCII anyway, so I was able to nearly cut in half the storage requirements of the database by changing the nvarchar fields into regular varchar fields. This also had a positive effect on the runtimes of my stored procedures that move data around within the database.
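The arithmetic is easy to check with a couple of lines of Python. This is only a sketch: it assumes UTF-16LE encoding is a reasonable stand-in for how nvarchar stores characters, the sample string is made up, and it ignores any per-row or per-column overhead.

```python
text = "Invoice 10045, region a0001"  # hypothetical ASCII-only value

as_varchar = text.encode("ascii")       # 1 byte per character
as_nvarchar = text.encode("utf-16-le")  # 2 bytes per character

print(len(text), len(as_varchar), len(as_nvarchar))
```

For pure ASCII data, the second encoding is always exactly twice the size of the first, which is where the "nearly half" savings comes from.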

Thursday, February 09, 2006

Finally Saw the Light

After all of the wondering I did about the value of the new DELETE TOP(x) in SQL Server 2005 (here), I finally saw the light today as I worked on a SQL Server 2000 project.

Picture this:

  • A huge-ass table with 2.5 million records taking up about 5.5 GB of disk space.
  • Most, but not necessarily all, of this data needs to be replaced monthly with a new snapshot that will be generated via an ETL (Extract, Transform, Load) process.
  • Due to disk space, the database has a hard limit of 12 GB, and the transaction log is also fixed in size.


The data is marked with a month and year, so replacing data consists of deleting existing records for the months/years that are being imported, and then copying the new records into the table. But, it's not that simple.

If you just remove all of the records in one batch using DELETE FROM table WHERE..., then you need to have enough room in the transaction log to write the "undo" information in case the batch fails (so it can roll back the entire DELETE). If the transaction log cannot accommodate the 5.5GB of data that will be stuffed into it, then the batch will fail.

So, you need to find a way to break up the deletes into smaller chunks. Each smaller DELETE will still use the Transaction log, but after the command successfully terminates, that space will be reclaimed for subsequent operations.

Under SQL 2000, you're stuck with things like using Cursors in order to iterate through a list of month/year combinations from the incoming data (assuming that it was saved to an intermediate table first), and then using those values as part of a DELETE statement within the loop.

But, with our new friend SQL Server 2005, you can use something like multiple "DELETE TOP(1000) FROM table WHERE..." in order to break up the transaction.
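The chunking pattern itself looks something like the following Python sketch. Note the assumptions: it runs against an in-memory SQLite database rather than SQL Server, so it uses a LIMIT subquery where SQL Server 2005 would use DELETE TOP(n), and the snapshot table, its columns, and the batch size are all invented for illustration.

```python
import sqlite3

BATCH_SIZE = 1000  # hypothetical chunk size

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE snapshot (id INTEGER PRIMARY KEY, yr INTEGER, mo INTEGER)")
conn.executemany(
    "INSERT INTO snapshot (yr, mo) VALUES (?, ?)",
    [(2006, 1)] * 2500 + [(2005, 12)] * 500,
)
conn.commit()

def delete_in_batches(conn, yr, mo, batch_size=BATCH_SIZE):
    """Delete matching rows a chunk at a time, committing after every chunk."""
    batches = 0
    while True:
        cur = conn.execute(
            "DELETE FROM snapshot WHERE id IN ("
            "  SELECT id FROM snapshot WHERE yr = ? AND mo = ? LIMIT ?)",
            (yr, mo, batch_size),
        )
        conn.commit()  # each chunk is its own short transaction
        if cur.rowcount == 0:
            break  # nothing left to delete for this month/year
        batches += 1
    return batches

batches = delete_in_batches(conn, 2006, 1)
remaining = conn.execute("SELECT COUNT(*) FROM snapshot").fetchone()[0]
print(batches, remaining)
```

The loop keeps issuing the same small DELETE until it affects no rows, so no single transaction ever has to hold the undo information for the whole month.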

Now That's Product Placement!

On last night's LOST, Hurley was reading a manuscript for a mystery novel that was presumably written by one of the people aboard Flight 815 who did not survive.

Well, looky what's on Amazon (read the "About the Author" in the Editorial Review):

http://www.amazon.com/gp/product/1401302769/sr=1-1/qid=1139457754/ref=pd_bbs_1/002-3377525-3860841?%5Fencoding=UTF8

So, LOST is not using product placement in the form of showing a character drinking Pepsi. It's introducing original works, and then marketing those same products to us consumers in the real world. Brilliant, IMHO!

Wednesday, February 08, 2006

Has Anybody Seen My Code?

About 6 years ago, I wrote some great code as part of an ASP site. There was a SQL Server stored procedure that ran for a long time, and because of this, it would periodically update a status table with percent complete (it was performing cursor-based operations that applied a set of rules to a set of data, so it could accurately report how far along in the set of rules it was at any one time). The stored proc was kicked off from ASP/ADO script, and that web page would keep refreshing itself in order to update the % complete.

I had a very similar need today: to kick off a stored procedure from an OSASPADO (Old-School ASP/ADO) site and load a status page that kept refreshing while the proc continued to execute on the server. You know what? I can't find my old code, and I couldn't find a way to recreate it!

In the world of classic ASP running on IIS 6 (Windows 2003 Server) using a SQL Server 2000 backend, connections appear to be recycled shortly after the script finishes executing. I tried probably 6-10 different ways of starting the stored procedure, but it would always stall out within a minute.

I finally gave up, and went with the out-of-band approach: The ASP script would insert some parameters into a "jobs" table, and then a SQL Server scheduled job (via the SQL Agent) would check this table, and execute the stored procedure using these parameter values if data existed in the "jobs" table.
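For anyone who wants the shape of that out-of-band approach, here is a rough Python sketch. It is not the ASP or SQL Agent code: SQLite stands in for SQL Server, the status column is my own addition for tracking which rows have been handled, and enqueue_job and run_pending_jobs are hypothetical names for the web page's part and the scheduled job's part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs ("
    " id INTEGER PRIMARY KEY,"
    " params TEXT,"
    " status TEXT DEFAULT 'pending')"  # status column is an assumption
)

def enqueue_job(conn, params):
    """What the web page does: record the request and return immediately."""
    cur = conn.execute("INSERT INTO jobs (params) VALUES (?)", (params,))
    conn.commit()
    return cur.lastrowid

def run_pending_jobs(conn, procedure):
    """What the scheduled job does on each tick: run every pending request."""
    pending = conn.execute(
        "SELECT id, params FROM jobs WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    for job_id, params in pending:
        procedure(params)  # stand-in for executing the stored procedure
        conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
        conn.commit()
    return len(pending)

processed = []
job_id = enqueue_job(conn, "month=2;year=2006")
ran = run_pending_jobs(conn, processed.append)        # picks up the new job
ran_again = run_pending_jobs(conn, processed.append)  # finds nothing to do
print(ran, ran_again, processed)
```

The web request only ever does a quick INSERT, so it no longer matters how soon the connection gets recycled after the script finishes.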

I realize that this description is a little bit on the abstract side, but how the heck was I able to accomplish this task in the past on IIS 5/Windows 2000/SQL Server 7.0???

Thursday, February 02, 2006

@!#%& Quicktime

Every time that I pave my machine, I swear that I'm not going to load Quicktime again, but I eventually need to because I have yet to find a way to play Quicktime videos in Windows Media Player.

Despite explicitly telling the installer that I DO NOT WANT QUICKTIME TO BE MY DEFAULT PLAYER for anything but Quicktime videos, I always seem to see the plugin pop up when I click on, say, an MP3 file link on a web page:

Quicktime Plugin Playing My Browser's MP3 Files

(the link being displayed in this screenshot, btw, is another SSWUG podcast where a blog post of mine was mentioned:

http://www.sswug.org/sswugradio/the_where_clause_02feb2006.mp3)

PwopWatcher

Note: I did some blog refactoring, and extracted this information into its own post. It originally was here.

My little program is a console app that I scheduled to run every 3 hours using the Windows Task Scheduler. It looks to the config file for a list of feeds, and then fetches the RSS from those feeds, and searches for //item/enclosure[@url] elements in order to get the URL for each .torrent file that Pwop publishes. The .torrent file is then downloaded and saved to the drop directory that I configured µTorrent to watch. I mean, this is what everyone's looking for when they say that they want RSS capabilities in their BitTorrent client, right?

Oh, and I kind of take advantage of a little error-handling feature of µTorrent: if it already loaded a particular torrent, it will simply log a warning if you try to load that torrent again. This is useful because my console app blindly downloads torrents each time that it runs, but µTorrent just ignores the duplicates (only new torrents placed in the drop directory will be processed).

I call the program PwopWatcher, but only because I'm not too imaginative in naming projects, and chose this name because... well, because I use it to watch the Pwop RSS feeds (in other words, I don't mean to imply that it was produced by Pwop Productions).

It is a .NET 2.0 console application, and here's the source code (for Program.cs, the main class file that VS2005 creates for Console Application projects). I'm listing it here because I don't think this is rocket science, and it will hopefully benefit people looking for examples of processing RSS feeds and downloading files from the Internet using the .NET 2.0 Framework.


using System; // needed for Console and Exception

class Program
{
  static string dropDir = @"c:\pwop\drop\";

  static void Main(string[] args)
  {
    try
    {

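      // Keep the default drop directory if the "dropDir" setting is missing.
      string configuredDir = System.Configuration.ConfigurationManager.AppSettings["dropDir"];
      if (!string.IsNullOrEmpty(configuredDir))
      {
        dropDir = configuredDir;
      }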

      if (!dropDir.EndsWith(@"\"))
      {
        dropDir += @"\";
      }

      foreach (string feed in System.Configuration.ConfigurationManager.AppSettings["feeds"].Split(";".ToCharArray()))
      {
        ParseTorrentsFromFeed(feed);
      }
    }
    catch (Exception ex)
    {
      Console.Beep();
      Console.WriteLine(ex.ToString());
    }

    System.Threading.Thread.Sleep(30000);
  }

  static void ParseTorrentsFromFeed(string feed)
  {
    if (string.IsNullOrEmpty(feed))
      return;

    System.Xml.XmlDocument xDoc = new System.Xml.XmlDocument();

    try
    {
      xDoc.Load(feed);
    }
    catch (System.Xml.XmlException ex)
    {
      Console.Beep();
      Console.WriteLine("XmlException encountered while loading feed:\n\n" + ex.ToString());
      return;
    }

    System.Xml.XmlNodeList nl = xDoc.SelectNodes("//item/enclosure[@url]");

    foreach (System.Xml.XmlNode n in nl)
    {
      DownloadTorrent(n.Attributes["url"].Value);
    }

  }

  static void DownloadTorrent(string url)
  {
    try
    {
      string filename = url.Substring(url.LastIndexOf("/") + 1);

      using (System.Net.WebClient wc = new System.Net.WebClient())
      {
        wc.DownloadFile(url, dropDir + filename);
        Console.WriteLine("Fetched " + filename);
      }
    }
    catch (Exception ex)
    {
      Console.Beep();
      Console.WriteLine("Exception encountered while downloading torrent:\n\n" + ex.ToString());
      return;
    }
  }
}


The config file is expected to have two appSettings:

"feeds" is a semicolon-delimited list of RSS feeds
"dropDir" is the directory where the .torrent files will be saved to

µTorrent r0x0rz!

I've been seeding torrents for the different Pwop shows for a couple of weeks now. At first, I used Azureus, simply because that's what Carl initially recommended, plus it supported RSS feeds. Then, I discovered µTorrent.

Azureus is by far the most popular BitTorrent client at the moment. However, it is written in Java, and consumes a lot of memory. "But, memory is cheap," you might say. True, but my seeding system only has 256MB in it to start with, so I'm looking for something that is REALLY light on the physical memory.

That's the niche that µTorrent fills. The whole program is contained in a small 133kB executable, and it hardly consumes any memory when it runs. For example, it's been running for a couple weeks now, and I'm currently seeding 23 torrents, and the process is only consuming 2.8MB of physical memory and another 5MB of virtual memory. Compare that to the 40-60MB of physical memory that I sometimes saw Azureus using!

I'm not sure how good the built-in RSS capabilities of µTorrent are. To be truthful, after discovering that it has a built-in directory watcher that will automatically load any .torrent file that is placed into a drop directory, I just rolled my own RSS consumer.

Note: There was originally more info here that detailed the small "PwopWatcher" app that I created to consume the Pwop RSS feeds. I moved that content to its own post:

http://jasonf-blog.blogspot.com/2006/02/pwopwatcher.html

I'm Trapped in a Time Warp

If anyone is able to read this, please help! I'm blogging from my room at a bed and breakfast in Punxsutawney, Pennsylvania. I've been waking up to the same Sonny and Cher song every day for about a year now, and no matter what I seem to do, tomorrow always turns out to be February 2nd!



(Happy Groundhog Day)

Wednesday, February 01, 2006

Database Performance Increase Tip: Avoid MS-Access

A client of mine wanted to put together a tool to help them determine the profitability of their products. They didn't have a big budget, so initially (last year), they opted for an MS-Access-based tool, since they already knew (or thought that they knew) how to use Access. They were also under the assumption that this would be cheaper than designing and building a purely SQL Server-based solution (because they don't understand SQL Server).

So, my company developed the tool for them to take a bunch of overhead numbers and distribute the associated costs for a period of time over the list of invoice line items for the same period. Using Access. From a desktop client.

Ok, that's fine. The tool worked really fast with the sample data that they provided. Party on.

But, whoa! Their production data for just 1 month's worth of invoice data was multiple gigabytes in size - much too large to transfer into an Access database (which apparently has a 2GB file size limit). And, typically, the profitability was measured on a quarter's worth of data (3 months). Well, now we needed to introduce SQL Server into the equation in order to hold the large sets of data, while the business logic continued to exist in Access.

Did I mention that I was NOT involved in the architecture or construction at this point?

So, picture this: They would use Access to drive the data migration by linking to the source database in their data warehouse (using ODBC) and the destination database in SQL Server (also using ODBC) and pump all of those gigabytes' worth of data through the desktop client. Besides being extremely slow, this also had a nice side effect of bloating the transaction log file (LDF) to the point that it actually filled the disk.

Then, once data for a period was in place on the SQL Server, a VBA module in Access would fire off, open about 15 recordsets to various lookup tables, loop through each row of the invoice data, search the 15 recordsets for corresponding data using .FindFirst, and then eventually update the row in the invoice table with calculated data.

This whole process, end-to-end, would literally take them 3+ days to run. They would start it on a spare machine, and check back days later to see if it completed, or if it errored out (in which case, they had to start over).

Ok, so the MS-Access solution is now being called a prototype, and I'm now in the picture to ensure that they get better performance out of the next iteration of the project. The obvious improvements that I'm making are to use DTS to pump the data from Oracle into SQL Server 2000, to establish proper indexes (including a clustered index that includes the invoice date), and to move everything out of Access. That VBA module is being ported to a stored procedure, and instead of using cursor-based processing, I'm manually converting everything to set-based operations.

(To be fair, a lot of these changes were actually identified by the original developer of the MS-Access tool before I became involved.)

I don't have access to their production data at this point, but based on the data that I have, I'm predicting that the 3 days' worth of processing can be completed in 10 minutes. That's 432 times as fast (which, if I'm calculating correctly, is 43,100% faster??!), chiefly due to the logic being moved to the data tier instead of pumping data across the wire using ODBC and MS-Access.
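
As a check on that figure, here is the arithmetic spelled out. It assumes the old run takes exactly 3 days (it was really "3+") and that the 10-minute prediction holds; the symbols t_old and t_new are just my shorthand for the two run times.

```latex
\begin{align*}
t_{\text{old}} &= 3~\text{days} \times 24~\tfrac{\text{h}}{\text{day}} \times 60~\tfrac{\text{min}}{\text{h}} = 4320~\text{min} \\
\text{speedup} &= \frac{t_{\text{old}}}{t_{\text{new}}} = \frac{4320~\text{min}}{10~\text{min}} = 432 \\
\text{percent faster} &= (432 - 1) \times 100\% = 43{,}100\%
\end{align*}
```

In other words, the new process would run at 43,200% of the old speed, which works out to 43,100% faster.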

I'm excited to see this run on the actual production system...

Warner Bros. In2Movie Service To Use P2P

Something caught my eye as I read Neowin this morning: Warner Bros. is going to start offering movies and television shows for download (obviously at a cost).

But, that's not what caught my eye. The article was a little light on details, but it seems that Warner Bros. will use Peer-to-Peer (P2P) technology to distribute the content. Looking into my crystal ball, I can foresee this service failing, and here's why:

P2P distribution technology, like Gnutella or BitTorrent, depends on users uploading in addition to downloading. As a user, I don't mind providing the bandwidth to upload because I see that as a tax for being able to download content faster. But, this concept only works when the content is free to start with.

As soon as I have to pay for content, then I'm not interested in sharing my uplink. I expect the content provider to have a really big pipe so that I can download everything from that one source, and I don't want to wait forever for the content to arrive. I don't believe that I'm alone in thinking this way, either.

So, if WB is planning to just distribute their content using BitTorrent, then they're going to find a vast majority of downloaders who simply leech from the swarm (leeching is when someone downloads without uploading at all, or turns off the BitTorrent software as soon as the download completes so that no further uploading occurs). The result will be the opposite of what BitTorrent was designed to do: instead of downloads getting faster as more machines participate in the swarm, they will get slower because more demand will be placed on fewer nodes of the network.

Of course, I could be interpreting this incorrectly. Look at what Pwop Productions is starting to do by establishing a network of BitTorrent seeds through their Ambassador program. This is the future of P2P distribution, in case you haven't been paying attention.

Carl, who truly believes that his content must be of great quality, and must be free to the listener, assembled an army of volunteers (this blogger included). BitTorrent clients on these machines subscribe to an early feed that allows them to access torrents hours before they are released to the general public. The goal is that by the time a DNR, Mondays, dnrTV, or Hanselminutes show goes live, there are already a number of machines seeding the content.

This benefits Pwop in that there are lower demands on their network bandwidth, and benefits the listener in that they are able to download the content faster.

So, any major studio, like Warner Bros., should look to the Pwop model as inspiration for setting up their own distribution network. They can set up a grid of P2P seeder nodes scattered throughout the world, thus eliminating the need for one big central pipe. However, unlike Pwop, the studios cannot count on the same type of upload capability from their downloaders.