Using the getmetadata JSON service with a URL

getmetadata returns metadata from a URL. The URL can point to most file types that contain text, e.g. HTML, Word documents, PowerPoint presentations, PDFs, Flash and more.
Note: The URL can include URL-encoded portions.

To invoke getmetadata, send a GET request to http://service.metaglance.com/api/json/getmetadata together with your key and a URL. The service accepts GET requests only. It retrieves the file referenced by the URL, analyzes it, and returns the metadata as JSON. (See Metadata Overview and Metadata Details.)

Note: To get results from a text passage instead (for example the body of an email or a comment on a blog), see getmetadatatext.

Parameters

Two values are required by getmetadata: key and path. There is an optional third parameter: identifier.

Your key is displayed in the sidebar to the right. If you don't see it, log in and create a key, or sign up for a free account.

The path value should start with http:// and can point to any of the following file types:

  • Web Pages: .html, .htm
  • Microsoft Word: .doc, .docx
  • Adobe PDF: .pdf
  • PowerPoint: .ppt, .pptx
  • Adobe Flash: .swf
  • Excel: .xls
  • Plain Text: .txt
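
As a rough pre-flight check, a client could confirm that a path begins with http:// and look up its extension in the list above. The sketch below is illustrative only: the helper name describe_path and the lookup table are assumptions, not part of the service. A URL with no listed extension (a site's home page, for example) is not rejected, because the service itself decides how to handle it.

```python
import posixpath
from urllib.parse import urlparse

# Extensions listed in this document, mapped to the labels used above.
KNOWN_TYPES = {
    ".html": "Web Pages", ".htm": "Web Pages",
    ".doc": "Microsoft Word", ".docx": "Microsoft Word",
    ".pdf": "Adobe PDF",
    ".ppt": "PowerPoint", ".pptx": "PowerPoint",
    ".swf": "Adobe Flash",
    ".xls": "Excel",
    ".txt": "Plain Text",
}


def describe_path(path):
    """Hypothetical helper: return the listed file-type label, or None.

    Raises ValueError when the path does not start with http://.
    """
    if not path.lower().startswith("http://"):
        raise ValueError("path should start with http://")
    # Look only at the URL's path component, so query strings are ignored.
    extension = posixpath.splitext(urlparse(path).path)[1].lower()
    return KNOWN_TYPES.get(extension)
```

A return value of None only means the extension is not in the list; it does not mean the request would fail.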

The identifier value is the identifier of the resource for which metadata is being generated and should be a string. If it is omitted, it will default to the path. If it is included, it will be returned verbatim by the service.

Other file types, such as images, will return metadata, but the metadata will be limited since there is no text to analyze. At present, getmetadata fully analyzes only American English text.

Examples

Below is the full query format. Be sure to replace <yourkey> and <yourpath> with actual values.

http://service.metaglance.com/api/json/getmetadata?key=<yourkey>&path=<yourpath>
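
The query format above can be assembled programmatically. The following is a minimal sketch using only the Python standard library. The function names build_request_url and fetch_metadata are hypothetical, and percent-encoding the path value is an assumption based on ordinary HTTP practice rather than a documented requirement of the service. The UTF-8 decoding of the response is likewise assumed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ENDPOINT = "http://service.metaglance.com/api/json/getmetadata"


def build_request_url(key, path, identifier=None):
    """Hypothetical helper: build the GET URL from key, path and identifier."""
    params = {"key": key, "path": path}
    # identifier is optional; when omitted the service defaults it to the path.
    if identifier is not None:
        params["identifier"] = identifier
    # urlencode percent-encodes each value (assumed to be what the service expects).
    return ENDPOINT + "?" + urlencode(params)


def fetch_metadata(key, path, identifier=None, timeout=30):
    """Hypothetical helper: send the GET request and decode the JSON reply."""
    url = build_request_url(key, path, identifier)
    with urlopen(url, timeout=timeout) as response:
        # Assumes the response body is UTF-8 encoded JSON.
        return json.loads(response.read().decode("utf-8"))
```

Calling build_request_url("XXXXXX", "http://www.cnn.com") yields the endpoint followed by ?key=XXXXXX&path=http%3A%2F%2Fwww.cnn.com.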

To try this from a simple HTML form, use the following:

<!-- Replace XXXXXX with your key. The browser builds the ?key=...&path=... query string from the fields below. -->
<form id="tagurl" method="get"
action="http://service.metaglance.com/api/json/getmetadata">
<input type="hidden" id="key" name="key" value="XXXXXX" />
<input type="text" id="path" name="path" value="http://" />
<input type="submit" />
</form>

The web service will return a JSON object, like the one below.

{
"averagewordlength": "5.936139106750488",
"description": "CNN.com delivers the latest breaking news and information on the latest top stories, weather, business, entertainment, politics, and more. For in-depth coverage, CNN.com provides special reports, video, audio, photo galleries, and interactive guides.",
"format": "Web page",
"identifier": "http://www.cnn.com",
"language": "en",
"medianwordlength": "6.0",
"mediatype": "Text",
"mimetype": "text/html",
"readinglevel": "8.047667952548075",
"readingtime": "504.5214511304348",
"sentencecount": "115",
"size": "105308",
"subject": [
"cnn",
"travel",
"news",
"CNN news",
"CNN.com",
"CNN TV",
"news online",
"breaking news",
"U.S. news",
"world news",
"weather",
"business",
"CNN Money",
"sports",
"politics",
"law",
"technology",
"entertainment",
"education",
"health",
"special reports",
"autos",
"developing story",
"news video",
"CNN Intl",
"video",
"e-mail",
"password",
"sign",
"log",
"change world",
"updates",
"deadly attack",
"ugandan king turns",
"robots",
"volcano ash health",
"popular stories right",
"celebrity quotes",
"parents",
"online news trivia",
"promises",
"air travel"
],
"syllablecount": "2357",
"title": "CNN.com - Breaking News, U.S., World, Weather, Entertainment & Video News",
"uniquewordcount": "817",
"wordcount": "1499"
}

For more information on these values, see Metadata Overview or Metadata Details.
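
In the sample response, numeric values such as wordcount and readinglevel arrive as JSON strings, so a client will usually want to convert them before doing arithmetic. The sketch below shows one way to do that. The helper name coerce_numbers is hypothetical, and the split into integer and floating-point fields is inferred from the sample alone, so treat it as an assumption rather than a documented schema.

```python
import json

# Field groupings inferred from the sample response (an assumption).
INT_FIELDS = ("sentencecount", "size", "syllablecount",
              "uniquewordcount", "wordcount")
FLOAT_FIELDS = ("averagewordlength", "medianwordlength",
                "readinglevel", "readingtime")


def coerce_numbers(metadata):
    """Hypothetical helper: return a copy with numeric strings converted."""
    result = dict(metadata)
    for name in INT_FIELDS:
        if name in result:
            result[name] = int(result[name])
    for name in FLOAT_FIELDS:
        if name in result:
            result[name] = float(result[name])
    return result


# Abbreviated stand-in for a service response, for illustration only.
sample = json.loads(
    '{"wordcount": "1499", "readinglevel": "8.047667952548075",'
    ' "subject": ["cnn", "travel"]}'
)
parsed = coerce_numbers(sample)
```

Fields that are absent are simply skipped, and non-numeric fields such as subject and title pass through unchanged.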