Saturday, March 31, 2012

3/31/2012 Flying through Essex County beaches

In reviewing my blog entries, on 5/23/2011 I wrote about “Mapping the beaches of Essex County”. It was a good start – I was able to research and discover the beaches and their coordinates, and then display them on Google Maps. That post stopped with a random path through the points, in a herky-jerky fashion, without beach names.

I have been teaching about Google Earth at a number of regional libraries, so I want to investigate making a “Google Earth tour” of the North Shore beaches.

The first aspect of any process is “get the data”. I did this a year ago with the construction of the EssexCountyBeaches spreadsheet in Google Docs.

The second aspect of the process is “get the data into the program”. I was hoping to just import the data into Google Earth, but that fuctionality is only available in the Google Earth Pro version (which costs $399 per user per year). Of course there are a number of other features, but let’s stick with the free version for now.

Open the EssexCountyBeaches spreadsheet in Google Docs:

Open Google Earth (I have set Essex County, Massachusetts as my Starting Location). Go to Add > Folder, and call it Essex County beaches. It will appear in your list of My Places:

Since we want to start our tour at the “Essex County overview”, go to Add > Placemark. Call it Essex County, and click OK. It now appears in your “Essex County beaches” folder:

Let’s go to the first beach (Joseph’s Beach in Nahant). Copy the latitude from the spreadsheet into the Fly To box in Google Earth

Then type a comma in the Fly To box, then copy the longitude from the spreadsheet into the Fly To box in Google Earth, then click the fly to/begin search button. Google Earth flys you to the beach:

Go to Add > Placemark, Name it “Joseph’s Beach”, and click OK. It now appears in your “Essex County beaches” folder:

Repeat the process for the other 48 beaches. Once in a while you will back-track. When that happens, continue and get the Placemark into your list, then just click on it in the list and move it up. When you are finished, double-click on “Essex County” in the Essex County beaches folder, and get taken to an overview:

To make a Tour, click on Add > Tour, and the record icon apears in the lower-left corner of the map:

Click the red dot to start, then double-click the first beach (Joseph’s Beach). After you zoom-in, pause, then double-click the next beach. Continue until you finish, then click the red dot to stop. When the tour plays, click the Save icon, and name it “Essex County beaches”. The tour now appears at the bottom of the list in the Essex County beaches folder. The Play Tour icon has a picture of a movie camera – click it to play the tour.

Click on this link to download the zip file containing Google Earth places and tour
download the zip file
Unzip the file, and then open the kmz file in Google Earth. Have Fun!

Wednesday, March 28, 2012

3/28/2012 Springsteen Tour 2012

My daughters gave me a great birthday present: a ticket to Bruce Springsteen at the TD Garden in Boston two nights ago. I've been a fan for a long time (saw him with Jackson Browne at the Villanova Field House in October 1973), and this show was just GREAT!

The next day I hit Tour infomation on backstreets.com click for link and wanted to make maps of his USA and Europe tour legs. Starting with Excel, I made two spreadsheets - one for the USA

and one for Europe

I want to use Tableau for visualization; it has a nice little geocoding database (ZIP Codes, City Names, etc), so I only had to find latiutude-longitude for a few cities. Europe, on the other hand, needed latitude-longitude populated directly into the spreadsheet, so I consulted a misc. site on the Internet.

After bringing the files into Tableau, I ran the data in a time-series, which shows Bruce jumping all over the country (as the little red dotted lines indicate)(March 9 - May 2)

But that is nothing compared to the chicken scratches that makes up his European tour (May 13 - July 31):

Thanks, Bruce, for all the memories, and for keeping it alive.

Tuesday, March 20, 2012

3/20/2012 Business Intelligence 103

I am curious about the geographic distribution of my sales, so I added two columns to my original spreadsheet: ZIP and state. After populating them (in Excel) and saving, I refreshed the file link in Tableau, and the two new columns were successfully brought in as new Dimensions:

If you drag the ZIP dimension over onto the lower-right “Drop field here” box, the program will automatically calculate the correct latitude and longitude.

The results are displayed on a map:

I then filtered for just cds or books, and filtered for the lower 48 states. I then colored based on cd/book:

Without running spatial statistics, it looks like both products are equally distributed throughout the United States.

Tableau has a wonderful “Time series” feature. Populate the Pages section with the Date dimension (twice: one for Year, and then one for Month-within-Year). Since we are dealing with less than 200 points, I want to Show All Marks and uncheck Fade:

The animation cannot be shown in this blog, unfortunately, so you have to take my word that there is no apparent temporal factor that is meaningful.

Make another map, but instead of ZIP, use state. I filtered on just the lower 48, and cd or book. I changed the Marks from Automatic to Pie, and color the slices by cd/book. increase the size of the Pies, and you get a very nice map of Sales by State showing distribution between cds and books:

Since I am familiar with the population distribution of the United States, my sales seem to mirror that population density. And the split between cds and books appear to be close to equal, with various random deviations. Well, at least I have a good understanding of my data, even though I do not have any actionable information.

Wednesday, March 14, 2012

3/14/2012 Business Intelligence 102

This posting will show, and discuss, amazon.com sales data using Tableau for business intelligence analytics. Working from the same amazon.com sales data set discussed yesterday, I constructed a spreadsheet with seven columns: id, Date, Weekday, Time (Pacific), buyer time zone, buyer time, and cd/book:


I saved the file as amazon_detail.xlsx, quit Excel, opened Tableau 7.0, and connected to the spreadsheet (because the data set is not large, I imported all of it into Tableau, instead of “connecting live”).

Tableau makes each non-numeric column a “Dimension”, and each numeric column a “Measure”. Additionally it includes a “record count” as a measure:

This first graph shows the Date and Number of Records selected, and Date is expanded down to Month:

Horizontal lines can be added, but the software automatically groups the records by date, displays each month (spelled out), and generates this very elegant graph in only a couple of mouse-clicks. We can “overlay years”, but the data is probably too small to get much value from it – that type of analysis would be used to compare September Sales in one year against September Sales in the other years.

A more-useful graph is one showing distribution of sales-by-product – I sold an audiobook, and a few VHS tapes, but all the rest are split almost evenly between books and cds:

Adding Date to the Columns gives us Product Distribution for each year:

(I could do Quarterly, or even Monthly breakdowns, but I am just looking at the Big Picture for right now.) You can see that an audiobook was sold in 2010, and the VHS tapes were sold in 2010 and 2011; cd sales were zero in 2008 and 2011, but look to be the big movers in 2012; and while book sales seem relatively steady, not much is happening in 2012. Note that each graph goes out to the ~43 level, allowing for easy visual comparison between different bars in different years.

If we remove the Date field, and drag in the Weekday field, we can see how sales have occurred by weekday over the life-of-the-account:

Since the size of each weekday-box is the same, I can see that there does not seem to be much difference between weekend-versus-weekday sales, or early-in-the-week versus late-in-the-week. If you add Year(Date) as a dimension in the Columns, the data seems to get a little too fragmented for my liking.
I want to do some analysis using Time (what is the distribution of sales throughout 24 hours? is the distribution different for books than for cds?). Let’s view the data to see how Tableau handled the buyer time field:

Tableau handled the buyer time field by bringing in each hour/minute/AM/PM, but assigned them all to 12/30/1899 (same for the Time (Pacific) field). After thinking about the data and analysis, I like it this way – I am currently only interested in what hour a client made a purchase. Using the Time (Pacific) field will tell me when I should do something (yesterday’s blog showed me when I should adjust my selling prices), and the buyer time field will tell me if cd-buyers are at different times of the day from book-buyers. To keep this blog from getting too long, I will go to the final graph: Number of Records in the Columns, Hour(buyer time) and cd/book in the Rows, Filtered for just cd or book, and Color by cd or book:

I see that books (orange) peak at 10 AM buyer time, while the cds (green) peak at 5 PM.Since I am expecting to move more cds in the future, this confirms yesterday’s price-adjustment-time of 4:30 PM.

Tuesday, March 13, 2012

3/13/2012 Business Intelligence 101

(perhaps this should be called Business Intelligence – the first steps)
Business Intelligence (or Business Analytics) is the process of analyzing data generated in the process of running a business. That analysis will, hopefully, result in information (“information” is data that is useful) that we can use to make better business decisions.

I have a Seller’s Account on amazon.com, and over the past 3 years, I have sold 167 items. [Note for any would-be Amazon Entrepreneurs – the price you sell at, plus the shipping & handling charge, minus Amazon’s cut, minus the actual cost of shipping comes out to about zero. The reason I do this is to get “my stuff” out of my house and into the hands of someone who actually wants it.] This is a sample of the information associated with my sales:


Since I believe that music collections are going to be digital (iTunes on your computer/tablet/cloud/phone, or wma files ripped to your Windows machine), the days of physical cds are soon to be over. [A personal note: when everyone had turntables and records in the late 60’s and early 70’s, I had a tape deck – I would buy an album for $3.25, record it, and then sell it to fellow students for $3.00 – a good deal for them, and a good deal for me. I could get 10 albums on a reel of tape, and no albums to lug around.] As a result, I have many cds for sale on amazon.com, and would like to know when is the best time of day to go into my account and adjust them to be the Low Price (because this is the way I purchase: if the quality is ok, I go for the Low Price).

Being in Massachusetts, I thought that a good time to adjust prices would be either at noon (for the lunch crowd) or 6 PM (for the evening buying crowd). You can see that for each order, Amazon gives me the order date and order time (in Pacific time), along with the item details. For this project I am only interested in the time; additional analysis can certainly be done on day-of-the-week, as well as music cd-versus-book. Maybe I will throw all that data into Tableau for another blog.

I made two Excel spreadsheets. The first spreadsheet had 24 rows (one for each hour), 1 column specifying the hours, 12 columns filled-in for each Amazon page (15 entries per page), and 1 column summing the 12 detail columns. The data in the second spreadsheet is a direct link to the summary column in the first spreadsheet, and then I made a horizontal bar chart for each hour (now in Eastern Time because that is what I understand). As I populated each cell in the first spreadsheet, the bars grew on the right:



When all was done, the 167 orders had two peaks (Noon, and 5 PM), with secondary peaks at 8 PM, 1 PM and 11 PM. Since Amazon sells throughout the entire US, I should not be surprised that the data is much smoother then I anticipated – 11 PM on the East Coast is still only 8 PM in California. Without further analysis (day of the week? cd versus book? actual mailing address (therefore time zone) of the purchaser? amount of time Low Price holds?), and assuming that my Low Price will hold for a few hours, I am comfortable setting prices at 4:30 in the afternoon.

Thursday, March 1, 2012

3/1/2012 Locations on Google Maps, Part 2

As I said yesterday, I want to get “real” Google Maps in this blog, not just screenshots. Let's just see if the "link feature" can get to the htm code on Dixon Spatial Consulting...
Click here for DixonMap5
Ok, that just links to the map, but does not make the map appear in the blog. I googled "putting google maps in a blog", and four interesting links are

How-To: Putting Google Maps on Your Blog

Add a Google MAP to your Blog.

How to easily add interactive Google Maps to your Blogger posts

Embedding a map into a website or blog

Since the last is actually from the Support area of Google itself, it will probably be the best, but I am curious and will read them all. [...time passes ...]
Ok, it looks like there is not an easy way to do this. Well, there is an "easy way", but it does not deliver the functionality that I have when I code my own maps (multiple locations, specific icons and tool tips, etc) - work through Google Maps.

How does this look?

View Larger Map

Well, actually, it does not look too bad. You can see multiple points, zoom-in/zoom-out, change to satellite view/terrain view/and even Google Earth view with 3D buildings in downtown Boston! - I told Google Maps "Bank of America branches in Boston, MA".

It looks like I can get "my maps" into this blog, but it takes a little more "poking around under the hood" than I am currently comfortable with (Modify the Template Code, then ADD HTML/Javascript Gadget). Maybe at another time...