Why Use SV

What can you use the Search Visualizer for? Search engines such as Google, Yahoo and Bing are fine for most purposes, but there are some situations where the SV lets you do a lot more.

The examples on this page show some of those situations.

Some of these SV features are particularly useful for searching the Internet; some are particularly useful for handling large documents and large sets of documents.

With the SV, you can swiftly see the distributions of your selected keywords throughout a document. That tells you a lot. You can see how often your keywords occur, and where they occur; you can see whether they're scattered evenly across the document, or whether they cluster in just one or two places. You can also drill down into the underlying text and check the wording.
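If you'd like a feel for the idea behind a picture like this, here's a minimal sketch in Python. It isn't the SV's own code. It simply assumes that a document is cut into equal-sized squares of words and that each keyword is counted square by square. The function name keyword_distribution and the squares parameter are our own illustrative choices.

```python
import re

def keyword_distribution(text, keywords, squares=10):
    """Count how often each keyword falls in each of `squares` equal slices of the text.

    Illustrative only: the slicing scheme is an assumption, not the SV's method.
    """
    words = re.findall(r"[a-z']+", text.lower())
    rows = {k.lower(): [0] * squares for k in keywords}
    for i, word in enumerate(words):
        if word in rows:
            # Map the word's position to the slice it belongs to.
            rows[word][i * squares // len(words)] += 1
    return rows

sample = "love love love a b c d e f death death death"
print(keyword_distribution(sample, ["love", "death"], squares=4))
# {'love': [3, 0, 0, 0], 'death': [0, 0, 0, 3]}
```

A row with all its counts in one or two slices is a clustered keyword; a row with counts spread across the slices is an evenly scattered one.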

We've listed some hints, tips and suggestions which you might find useful. Each one is accompanied by an illustration of an SV search. One thing which you'll notice is that the illustrations vary in square size. That's a feature of SV which experienced users find very useful – different square sizes are good for different purposes.

We've included various sample texts on the site so you can try out the SV on them. We've started with Shakespeare, as an example of how you can get fresh insights into a much-studied set of texts by using the SV, and a selection of historical records from the American Civil War, as an example of how you can use the SV to handle large documents (the Gettysburg document alone is over half a million words long).

If you have any hints or tips about using the SV that you'd like to share with other users, please let us know via the "Suggestions" feature on the home page. We hope that you'll find the SV useful, and that you'll enjoy using it.

Finding the most relevant chunk of a large document


Finding the right document is often just the start – you then need to find the right section. That can take a long time with "find in document" functions on standard browsers. What you can do with SV is to see where the mentions of a particular topic occur within the document. Here's a record from an Internet search about aluminum in Land Rovers. "Land" is in red, "Rover" in green and "aluminum" in blue.

This is the only point in the document that mentions aluminum.

"Needle in a haystack" searches – finding the relevant information among a mass of false positives


A lot of searches produce huge numbers of irrelevant hits. Here's an example. If you're searching for someone by name, then you'll get a lot of irrelevant records about people with the same first name or the same family name. Searching for the name within inverted commas (e.g. "Dr John Smith") reduces the number of irrelevant records, but also filters out a lot of potentially relevant ones, such as records about John Henry Smith or Dr J. Smith. With SV, you can look for places where the words occur close enough together to be promising.

The example below is from one of the sample texts; it shows mentions of Jefferson Davis, president of the Confederate States of America. His name was sometimes abbreviated to Jeff Davis or Jeffn Davis, and he was sometimes referred to as President Davis or the president. Here's a result for an SV search on Jeff Davis president in an online archive, using the SV's "partial match" option to catch the different versions of his first name.

This search produced two hits on "Jefferson Davis" plus a hit on "Jeff" on its own. That turns out to be a false positive on "Jefferson County". There's also a hit on "President Jefferson Davis" and a hit near the end on "president" which turns out to be "the president".
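With the SV you judge closeness by eye. Purely as an illustration of the same idea, here's a small Python sketch that looks for places where partial matches of all the keywords fall within a few words of each other. The function name proximity_hits, the window parameter and the sample sentence are hypothetical; this is a sketch under those assumptions, not how the SV works internally.

```python
import re

def proximity_hits(text, prefixes, window=5):
    """Return the starting word positions of every `window`-word span in which
    each prefix matches the beginning of at least one word (a partial match)."""
    words = re.findall(r"[a-z]+", text.lower())
    prefixes = [p.lower() for p in prefixes]
    hits = []
    for start in range(len(words)):
        span = words[start:start + window]
        if all(any(w.startswith(p) for w in span) for p in prefixes):
            hits.append(start)
    return hits

sample = ("President Jefferson Davis spoke. "
          "Later, in Jefferson County, nothing happened.")
print(proximity_hits(sample, ["jeff", "davis"], window=3))
# [0, 1]
```

The partial match on "Jefferson County" is not reported, because no word starting with "davis" occurs within the same three-word span.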


Finding something which is not one of the "usual suspects"

A classic problem in online search is eliminating "usual suspects" which don't interest you; it's very difficult to do that without throwing out records which also happen to contain the topic that does interest you. Suppose, for instance, that you're interested in sources of renewable energy other than wind, wave and solar: if you tell your search engine to filter out records containing those three words, then it will also filter out quite a few records which mention not only those three words, but also other terms which are relevant.

The image on the left shows one of the results from a search for the keywords wind wave solar and renewable.

The keywords occur in bands, with a gap in the middle. That suggests that the text is structured into sections, and that there's a section in the gap which is about some other form of renewable energy. The blank section is, in fact, about tidal energy.
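The same reasoning can be sketched in a few lines of Python: given the word positions where any of the keywords occur, find the longest stretch with no hits at all. The function name largest_gap and the sample positions are made up for illustration, and the SV itself simply shows you the gap visually.

```python
def largest_gap(hit_positions, total_words):
    """Return (start, end) of the longest keyword-free stretch of the text,
    as a half-open range of word positions."""
    # Sentinels mark the positions just before the text and just after it.
    edges = [-1] + sorted(hit_positions) + [total_words]
    before, after = max(zip(edges, edges[1:]), key=lambda pair: pair[1] - pair[0])
    return before + 1, after

# Hypothetical hits for "wind", "wave" and "solar" in a 50-word text.
print(largest_gap([2, 3, 5, 40, 41], total_words=50))
# (6, 40)
```

Words 6 to 39 contain none of the keywords, so that's the stretch worth reading to see what else the text covers.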

Getting a quick overview of a text

Here's the story of Romeo and Juliet in one picture. Love is shown in red, death in green. The play contains 25,822 words, condensed here into one image.


Opening scene: some mentions of death, to prepare the audience for this being a tragedy.

Lots of mentions of love.

A couple of mentions of death.

Lots more mentions of love.

More love.

Still more love.

Increasing mentions of death.

Only death.

Closing mentions of love.

Comparing texts to each other

Here's what happens when you compare how often two gospels mention scribes (in red) and Pharisees (in green). Why so few mentions of scribes in John? One tradition was that he was himself a former scribe, and was reluctant to criticise them.



Images: Comparing Texts - Luke; Comparing Texts - John

Searching in a foreign language


It's fairly easy to use online software to translate a relevant document out of a foreign language into your own language, but how can you decide whether a record is relevant or not in the first place, if it's written in a language that you don't speak?

The images on the right are from an SV search of the Internet for records about bradycardia medication, in German. The German word bradykardie is in red, and medikament is in green. It's clear that the first record looks highly relevant, whereas the fifth looks like a false positive. With SV, you just need to put in the keywords in the other language, and then you can identify records that are likely to be worth running through a translation package.

Types of search available in SV

Entire web

You can use SV to find relevant websites more easily when you're doing an internet search. Here's an example. It's a search for a cottage in Scotland which has trout fishing. The SV image shows that all of the sites on this page contain mentions of trout, but only one of the sites actually mentions fishing in the text of its header.


Single site

With SV, you can focus on a single site that interests you. Here's an example. It shows mentions of the words Iran and threat on the US government site www.whitehouse.gov. You might also find this feature useful for searching an e-commerce site such as amazon.com, or sites which contain an online copy of a text which interests you (e.g. a particular book).


Online texts

We've put a variety of classic texts onto the SV site, in an SV-friendly format (a lot of online texts on the Web contain substantial amounts of extra material, such as long introductions). Here's an example. It shows a search of A Midsummer Night's Dream using the SV synonym function, so that the words sleep and dream are treated as synonyms of each other (shown in red). We've done the same for fairy and fairies (shown in green). This visualization shows how these two themes occur separately at the very start of the play; however, they soon become tightly intermingled.
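To make the idea of synonyms concrete, here's a rough Python sketch in which several words share one colour. The function name colour_hits and the dictionary layout are assumptions for illustration only, and they don't describe the SV's actual synonym function.

```python
import re

def colour_hits(text, synonym_groups):
    """Return (word position, colour) for every word that belongs to a synonym group."""
    # Flatten {colour: [words]} into {word: colour} for quick lookup.
    lookup = {word: colour
              for colour, words in synonym_groups.items()
              for word in words}
    tokens = re.findall(r"[a-z']+", text.lower())
    return [(i, lookup[t]) for i, t in enumerate(tokens) if t in lookup]

groups = {"red": ["sleep", "dream"], "green": ["fairy", "fairies"]}
print(colour_hits("To sleep, perchance to dream of a fairy", groups))
# [(1, 'red'), (4, 'red'), (7, 'green')]
```

Because sleep and dream map to the same colour, they show up as a single theme rather than as two separate keywords.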
