- Jeanne Kramer-Smyth (Chair) Building Archives Websites That Google Will Love
- Matt Herbison Online Collections Crawlability for Libraries, Archives, and Museums
- Mark A. Matienzo Findability in the Flow: Discovery through Linking
Disclaimer (borrowed from David Weinberger): Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words.
Jeanne Kramer-Smyth Building Archives Websites That Google Will Love
If every page on your website has the same name, it’s like a box of chocolates – your user never knows what she/he will get without biting into each one (or drilling down into each page).
Keys to good (for SEO) page construction:
- Unique page title, fewer than 65 characters
- That title should have a clear keyword target phrase
- One URL per page
- Unique meta description, fewer than 164 characters
- Content-rich text – not a big image with a few words, but a big (or small) image with a paragraph of explanation
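A rough sketch of how the title and description rules above could be checked automatically. The 65 and 164 character limits come from the talk; the class and function names are made up for illustration, and the code only uses Python's standard-library HTML parser, so treat it as a starting point rather than a full audit tool.

```python
from html.parser import HTMLParser

TITLE_LIMIT = 65         # character limit quoted in the talk
DESCRIPTION_LIMIT = 164  # character limit quoted in the talk


class HeadScanner(HTMLParser):
    """Collects the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def check_page(html):
    """Return a list of problems found in one page's title and description."""
    scanner = HeadScanner()
    scanner.feed(html)
    problems = []
    title = scanner.title.strip()
    if not title:
        problems.append("missing title")
    elif len(title) >= TITLE_LIMIT:
        problems.append("title too long")
    if scanner.description is None:
        problems.append("missing meta description")
    elif len(scanner.description) >= DESCRIPTION_LIMIT:
        problems.append("meta description too long")
    return problems


def duplicate_titles(pages):
    """Map each title used by more than one URL to the URLs that share it."""
    seen = {}
    for url, html in pages.items():
        scanner = HeadScanner()
        scanner.feed(html)
        seen.setdefault(scanner.title.strip(), []).append(url)
    return {title: urls for title, urls in seen.items() if len(urls) > 1}
```

Here `pages` is assumed to be a dict of URL to HTML source that you have already fetched; `duplicate_titles` is the "box of chocolates" test, and `check_page` covers the two length limits.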
Analytics
- Where did they come from?
- Traffic sources and keywords
- When and why do they leave?
- Determine your bounce rate (visitor leaves immediately) and exit pages
- Where am I on a Google search?
- There are lots of good tools, among them Google Webmaster Tools
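To make "bounce rate" and "exit pages" concrete, here is a small sketch. It assumes each visit has already been exported as an ordered list of page paths (analytics packages report these numbers for you, so this is only to show the arithmetic), and it treats a bounce as a visit that viewed exactly one page.

```python
from collections import Counter


def bounce_rate(sessions):
    """Share of sessions that viewed exactly one page before leaving."""
    if not sessions:
        return 0.0
    bounces = sum(1 for pages in sessions if len(pages) == 1)
    return bounces / len(sessions)


def exit_pages(sessions):
    """Count how often each page was the last one viewed in a session."""
    return Counter(pages[-1] for pages in sessions if pages)


# Hypothetical visits, each an ordered list of the pages viewed.
sessions = [
    ["/"],
    ["/", "/collections", "/collections/42"],
    ["/collections/42"],
    ["/", "/about"],
]
print(bounce_rate(sessions))              # share of one-page visits
print(exit_pages(sessions).most_common())  # pages people leave from most
```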
Next steps/action items
- Look for the websites that rank higher for the phrases you care about. Do they have better material, or are they just playing the game better?
- Look at analytics
- Bake SEO into discussions of web site (re)design. Think about SEO ahead of time
- Follow some SEO blogs, such as SEOmoz Daily SEO blog
Matt Herbison Online Collections Crawlability for Libraries, Archives, and Museums
[Matt tweeted each one of his slides during his presentation. Twitter scheduling client? Brilliant!]
The goal is crawlability
If web crawlers can’t get to your stuff, your stuff won’t turn up in search engines, and users won’t find your stuff
Search engines don’t search (query) your site. They can only browse or click links. Content that’s locked in databases (Deep Web/Hidden Web) can’t be searched
Questions to ask about your website:
- Does it block robots? Robots.txt tells web crawlers which parts of your website to ignore (that is, NOT to crawl). A lot of websites have old versions of these files, inadvertently blocking crawlers from seeing the good stuff. Example: Minnesota Historical Society robots.txt blocks directory /VisualResources.
- Does it use sitemap.xml? This tells web crawlers what to look for and which pages to index. It’s like a traditional sitemap for human users, but much more detailed
- Functions in a similar way to OAI-PMH
- Vanderbilt Television News Archive has more than 800,000 records, most of which have summaries. The site has nested branching by month and year. So the top-level sitemap points to each year/month sitemap, each of which in turn points to its share of those 800k items
- Long URLs? Long URLs usually carry search parameters, but a URL that is too long is hard for crawlers to crawl
- Resources on NARA website are a “nightmare of crawlability”: no browsing, enormous and unstable URLs, but the URLs don’t need to be that long
- Can you use short, permanent URLs (or permalinks)? If you have a database-driven system, each item probably has a unique identifier that you can use in the URL
- Is your content hidden in fancy features (such as Ajax, JavaScript, Flash)? Then it is hidden from crawlers too
- Don’t let anyone sell you a Flash-only site. It looks beautiful but content is hidden
- Do people get trapped too far in your site?
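One way to answer the robots.txt question above is to test a few of your most important paths against the file. The sketch below uses Python's standard `urllib.robotparser`. The robots.txt text is a made-up example modeled on the kind of rule described in the talk (it is not the actual Minnesota Historical Society file), and `blocked_paths` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, modeled on the rule described in the talk.
ROBOTS_TXT = """\
User-agent: *
Disallow: /VisualResources
"""


def blocked_paths(robots_txt, paths, user_agent="*"):
    """Return the paths that the given robots.txt tells a crawler to skip."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(user_agent, path)]


# Hypothetical paths you would want search engines to reach.
important = ["/VisualResources/photo/123", "/collections/index.html"]
print(blocked_paths(ROBOTS_TXT, important))
```

Anything the function returns is content a well-behaved crawler will never see, so an old rule that lists a directory full of good material is worth removing.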
General approaches for beating the system
- Incoming links highlight resources wherever they appear in your site [unclear how this works]
- Get to know Google Webmaster Tools!
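The nested sitemap idea from the questions above (a top-level sitemap that points to one sitemap per year/month, each listing individual items) might look roughly like this. The XML namespace is the one defined by the sitemaps.org protocol; the base URL, the `/records/<id>` permalink pattern, and the file naming are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
BASE_URL = "https://archive.example.org"  # hypothetical site


def month_sitemap(record_ids):
    """Build a <urlset> listing one short permalink per record in a month."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for record_id in record_ids:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{BASE_URL}/records/{record_id}"
    return ET.tostring(urlset, encoding="unicode")


def sitemap_index(months):
    """Build the top-level <sitemapindex> pointing at each month's sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for year, month in months:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = (
            f"{BASE_URL}/sitemaps/{year}-{month:02d}.xml"
        )
    return ET.tostring(index, encoding="unicode")


# Hypothetical months and record identifiers.
print(sitemap_index([(1968, 8), (1968, 9)]))
print(month_sitemap([101, 102]))
```

Splitting by month keeps each file small (the protocol caps a single sitemap at 50,000 URLs), and using the database identifier in the URL doubles as the short permalink recommended above.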
Mark A. Matienzo Findability in the Flow: Discovery through Linking
Why are we interested in SEO? Because we want to be found
The difference between Search and Discovery (Marchionini 1995)
[Good bullet list of differences. Full article here: www.ils.unc.edu/~march/getty.pdf]
Discovery happens elsewhere (Lorcan Dempsey) – people are going to learn about your collections and resources on places other than your site
Go where your users are
URLs are the currency of the web (L Dempsey again): URLs and the pages themselves can be shared by and among users for information, for influence and for good will
Think about how links impact the way people find your resources.
- Look for backlinks – that is, find out where people are coming from. Google Analytics is useful for showing this
- Follow your nose: how do you link out to other resources? How is your information related to other pockets of information on the web?
The Facebook “Like” button is more than a silly thing folks can do on FB. You can add a Like button to any page so users can easily link to your site on FB.
You can “Like” real things. The Open Graph protocol is designed for pages about real-world things
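For the curious, Open Graph markup is just a handful of `<meta>` tags in the page head. This sketch renders the four basic properties (`og:title`, `og:type`, `og:url`, `og:image`) for one item page; the function name and the example URLs are invented for illustration and were not shown in the talk.

```python
from html import escape


def open_graph_tags(title, og_type, url, image):
    """Render the four basic Open Graph properties as <meta> tags."""
    properties = {
        "og:title": title,
        "og:type": og_type,
        "og:url": url,
        "og:image": image,
    }
    return "\n".join(
        f'<meta property="{name}" content="{escape(value, quote=True)}" />'
        for name, value in properties.items()
    )


# Hypothetical item page on a hypothetical archive site.
print(open_graph_tags(
    'Letters of "A. Smith"',
    "website",
    "https://archive.example.org/records/42",
    "https://archive.example.org/records/42/thumb.jpg",
))
```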
Wikipedia: there are legitimate concerns about how reliable the content is, but it has proven to be a community that is responsive to people’s concerns. People often complain “Wikipedia is not authoritative. It doesn’t cite sources!” The Wikipedia page on Yale University cites over 114 references. Categories in Wikipedia allow editors to group like sites.
Know your community: Wikipedia has standards that you should know and follow. See policy on “Subject and culture sector professionals,” which encourages us to improve Wikipedia by adding information to pages and linking to our sites from Wikipedia
Linkypedia: a new tool to see how and where your site is linked on Wikipedia. [Wow!]
Q&A
Question: If you are in a small shop with minimal IT support, how do you prioritize what to do (if you can only do 1 or 2 things)?
Matt: Do two things first: (1) Add intelligent, respectful links to your site from Wikipedia and (2) improve your sitemap.xml
Jeanne: Fix your page titles
Question: Two comments:
- Google can access a well-built Flash website – it is challenging, but it can be done
- Making a website usable and accessible to users with disabilities will make it more crawlable
Jeanne: Adhering to Section 508 will help your SEO, and improving SEO makes your site more usable for the disabled
Question: What about non-HTML content, such as wiki pages, on your site?
Jeanne: Ultimately it’s all HTML (except for images), but wiki pages are great for SEO because they are so well-structured
Mark: The web is getting more data-intensive, so we want to have multiple representations of our data, such as both human-readable and machine-readable. If you can provide information as structured data, it may allow you (and enable others) to do more experimental things
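A tiny illustration of Mark's point about multiple representations: the same record rendered once for people (HTML) and once for machines (JSON). The record and its field names are hypothetical and not taken from any particular standard or from the talk.

```python
import json
from html import escape

# Hypothetical catalog record; the field names are illustrative only.
RECORD = {
    "id": "42",
    "title": "Smith Family Papers",
    "dates": "1850-1900",
    "url": "https://archive.example.org/records/42",
}


def as_html(record):
    """Human-readable representation of the record."""
    return (
        f"<h1>{escape(record['title'])}</h1>\n"
        f"<p>Dates: {escape(record['dates'])}</p>\n"
        f"<p><a href=\"{escape(record['url'], quote=True)}\">Permalink</a></p>"
    )


def as_json(record):
    """Machine-readable representation of the same record."""
    return json.dumps(record, indent=2, sort_keys=True)


print(as_html(RECORD))
print(as_json(RECORD))
```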
Question: What about CONTENTdm?
Matt: Item-level search engine hits often come from known-item searches. Any time you can aggregate above the item level, you are more likely to be found. For example, the CONTENTdm (CDM) default is to show 20 items on a page. If you can group those 20 items [with a collective description?], that group is more likely to be found, and then users can choose the individual thing they want.
Matienzo’s slides are here: http://www.slideshare.net/anarchivist/findability-in-the-flow-discovery-through-linking
Comment by HaeB — August 14, 2010 @ 11:43 am
Thanks for the link! I’ve updated the post accordingly.
Comment by Rob — August 14, 2010 @ 1:19 pm