Add Links to a Block of Text Automatically

Auto-Linker uses the Yahoo! API to add relevant links to keywords in any text.

Inserting links into web pages by hand is a mundane task. The Yahoo! Auto-Linker automates it by replacing keywords in any text with hyperlinks to the top Yahoo! search result for each keyword. The result of this hack can be useful and inspiring, bizarre, or just plain amusing. The Auto-Linker accepts a bit of text via a web page form like the one shown in the figure below and returns the same block of text with hyperlinks inserted.

The Auto-Linker text-entry form


At http://blog.outer-court.com/yahoo/autolinker.php5, you can play with a working version of Auto-Linker and see how it works in detail. Once you're comfortable with it, you can build your own version with the code in this hack.

The Auto-Linker uses two Yahoo! API services to auto-link a given text. First, it finds the significant phrases within the text using Yahoo!'s Term Extraction service (http://developer.yahoo.net/content/V1/termExtraction.html).

Second, it uses the Yahoo! Web Search (http://developer.yahoo.net/web/V1/webSearch.html) to find the top web page for this phrase. All that's left to do is add the corresponding HTML link to each occurrence of the phrase within the text.

The Code

The Auto-Linker is written in PHP 5, so save the following code to a file called autolinker.php5:

	<html>
	<body>
	<?php
	$text = ( isset($_POST['text']) ) ? $_POST['text'] : '';
	$rel = ( isset($_POST['rel']) ) ? $_POST['rel'] : '';
	$engine = ( isset($_POST['engine']) ) ? $_POST['engine'] : '';
	$text = stripslashes($text);
	
	$maxLength = 2000;
	if ( strlen($text) >= $maxLength ) {
		$text = substr($text, 0, $maxLength - 1) . '…';
	}

	echo '<h1>Auto-Linker</h1>';

	if ($text == '')
	{
	?>
	<p>This tool uses the Yahoo API to link significant words and phrases from a
	text you provide.</p>

	<form action="autolinker.php5" method="post"><div>
	<textarea style="font-size: 90%" name="text"
				cols="58" rows="8"></textarea><br />

	Relation:
	<select name="rel">
		   <option value="">[Default]</option>
		   <option value="nofollow">Nofollow</option>
	</select>

	&nbsp; Search Engine:
	<select name="engine">
		   <option value="yahoo">Yahoo</option>
		   <option value="google">Google</option>
	</select>

	<input type="submit" value="Submit" />
	</div></form>
	<?php
	}

This code makes sure that if the text parameter has not been submitted, the script presents a <textarea> to be filled out. The user can also choose between links returned from Yahoo! or a Google web search.

Once the text is submitted to the script, the actual auto-linking takes place. Here is the else clause that triggers auto-linking:

	else
	{
		$sLinked = autoLink($text, $rel, $engine);
		echo '<p style="font-size: 105%;">' .
				$sLinked . '</p>'; 
		showCopyable($sLinked); 
		echo '<p><a href="autolinker.php5">[Auto-Linker Home]</a></p>';
	}

The showCopyable function just inserts a <textarea> where the user can copy the HTML source of the auto-linked result. The auto-linking core is in the autoLink function:
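The listing calls showCopyable but never shows its body. The following is only a guess at what such a helper might look like: the buildCopyable name, the label text, and the textarea attributes are assumptions, not the author's code. The one essential step is escaping the markup so the browser displays the tags instead of rendering them.

```php
<?php
// Hypothetical helper (not from the original listing): builds a read-only
// <textarea> that holds the escaped HTML source of the linked text.
function buildCopyable($html)
{
	// Escape the markup so the tags are displayed rather than rendered.
	$escaped = htmlspecialchars($html, ENT_QUOTES, 'UTF-8');

	return '<textarea cols="58" rows="8" readonly="readonly">' .
			$escaped . '</textarea>';
}

// Assumed shape of the function the script calls after echoing the result.
function showCopyable($html)
{
	echo '<p>Copy the HTML source:</p>' . buildCopyable($html);
}
```

Keeping the string-building part separate from the echo makes the helper easy to check without a browser.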

	function autoLink($s, $rel, $engine)
	{
		$s = strip_tags($s);
		$sRel = ($rel != '') ? ' rel="' . $rel . '"' : '';

		// Replace the "insert App ID" placeholder with your own Yahoo! App ID.
		$url = 'http://api.search.yahoo.com/ContentAnalysisService/' . 
				'V1/termExtraction?appid=insert App ID&' . 
				'context=' . urlencode($s);

		$dom = new domdocument;
		$dom->load($url);
		$xpath = new domxpath($dom);
		$xNodes = $xpath->query('//Result');

		$counter = 0;
		$maxLinks = 10;
		foreach ($xNodes as $xNode)
		{

			if (++$counter > $maxLinks) { break; }
			$phrase = $xNode->firstChild->data;

			$phraseUrl = '';
			
			if ($engine == 'google') {
				$phraseUrl = getTopLinkGoogle($phrase);
			}
			else {
				$phraseUrl = getTopLinkYahoo($phrase);
			}

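			if ($phraseUrl != '') 
			{
				// Link at most four occurrences of the phrase.
				// The /e modifier needs PHP 5; it was removed in PHP 7.0.
				$s = preg_replace('@( ' . preg_quote($phrase, '@') . ')@ei', 
				'\' <a href="' . $phraseUrl . 
				'">\' . trim(\'$1\') . \'</a>\'', 
				$s, 4);
			}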
		}
		$s = str_replace("\r\n", '<br />', $s);

		return $s; 
	}
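The /e modifier that the preg_replace call above relies on was deprecated in PHP 5.5 and removed in PHP 7.0. On newer interpreters, the same replacement could be written with preg_replace_callback, roughly as sketched here. The linkPhrase name and the example URL are hypothetical, the preg_quote and htmlspecialchars calls are precautions added for this sketch, and the closure syntax needs PHP 5.3 or later.

```php
<?php
// Hypothetical replacement for the /e-based preg_replace call in autoLink.
function linkPhrase($text, $phrase, $url, $limit = 4)
{
	// Same pattern as the listing: a blank, then the phrase, case-insensitive.
	$pattern = '@( ' . preg_quote($phrase, '@') . ')@i';
	$href = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

	return preg_replace_callback($pattern, function ($m) use ($href) {
		// Keep the leading blank outside the link, as the original does.
		return ' <a href="' . $href . '">' . trim($m[1]) . '</a>';
	}, $text, $limit);
}

$linked = linkPhrase('Meet Clark Kent today.', 'clark kent', 'http://example.com/');
// $linked is: Meet <a href="http://example.com/">Clark Kent</a> today.
```

Inside autoLink, the call would take the place of the preg_replace statement: $s = linkPhrase($s, $phrase, $phraseUrl);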

The autoLink function takes the parameters s (the whole text), rel (the link relation, either default or nofollow), and engine (the search engine, either yahoo or google). Then the function requests the list of significant phrases from the Yahoo! API. Yahoo! recommends using a POST request for longer text, but a GET request, as used here, also works. Yahoo!'s returned XML looks like this, with all lowercase values:

	<?xml version="1.0" encoding="UTF-8"?>
	<ResultSet…>
		<Result>superman</Result>
		<Result>clark kent</Result>
		<Result>super powers</Result>
	</ResultSet>

The script applies an XPath expression to this XML to iterate through all Result elements and read their values. For each phrase, preg_replace then searches the text and wraps up to four occurrences in a link. The blank before the phrase in the pattern restricts matches to the start of a word, and the i modifier makes the match case-insensitive.
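The request and parsing steps can be tried without network access. The sketch below builds the stream-context options for the POST variant mentioned earlier and runs the same //Result query over the sample response. Both function names and the YOUR_APP_ID value are hypothetical, and the sample omits the ResultSet attributes that are elided in the text.

```php
<?php
// Hypothetical helper: stream-context options for sending the text by POST
// with the same appid and context parameters the GET version uses.
function buildTermExtractionPost($appId, $text)
{
	$body = http_build_query(array(
		'appid'   => $appId,
		'context' => $text,
	));

	return array('http' => array(
		'method'  => 'POST',
		'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
		'content' => $body,
	));
}

// Hypothetical helper: collects the text of every Result element.
function extractPhrases($xml)
{
	$dom = new DOMDocument();
	$dom->loadXML($xml);
	$xpath = new DOMXPath($dom);

	$phrases = array();
	foreach ($xpath->query('//Result') as $node) {
		$phrases[] = $node->nodeValue;
	}
	return $phrases;
}

// Sample response from the text, without the elided ResultSet attributes.
$sampleXml = '<ResultSet><Result>superman</Result>' .
		'<Result>clark kent</Result>' .
		'<Result>super powers</Result></ResultSet>';

$options = buildTermExtractionPost('YOUR_APP_ID', 'Clark Kent has super powers.');
$phrases = extractPhrases($sampleXml);

// A live call might look like this, with $endpoint set to the service URL:
//   $xml = file_get_contents($endpoint, false, stream_context_create($options));
//   $phrases = extractPhrases($xml);
```

If a live response declares a default XML namespace on ResultSet, the plain //Result query would match nothing; in that case the query would need a prefix registered with registerNamespace or a local-name() test.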

The link will be taken from either Yahoo! or Google, using these two functions:

	// We grab results from Yahoo's "REST" API again
	// using PHP5's nice native XML functionality.

	function getTopLinkYahoo($q)
	{
	
		// Replace the "insert app ID" placeholder with your own Yahoo! App ID.
		$url = 'http://api.search.yahoo.com/WebSearchService/' . 
				'V1/webSearch?appid=insert app ID&max=1&q=' . 
				urlencode($q);
		$dom = new domdocument;
		$dom->load($url);
		$xpath = new domxpath($dom);
		$topUrl = $xpath->query('//Url')->item(0)->firstChild->data;

		return $topUrl;
	}
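getTopLinkYahoo calls item(0) directly. DOMNodeList::item() returns null when there is no node at that index, so a search with no results would raise an error instead of producing the empty string that autoLink already knows how to skip. One possible guard is sketched below; the firstNodeText name is hypothetical, and the two XML strings are stand-ins for testing, not real API output.

```php
<?php
// Hypothetical helper: returns the text of the first node matching $query,
// or '' when the query fails or matches nothing.
function firstNodeText(DOMXPath $xpath, $query)
{
	$nodes = $xpath->query($query);
	if ($nodes === false || $nodes->length === 0) {
		return '';
	}
	return trim($nodes->item(0)->nodeValue);
}

// Offline check with two stand-in documents.
$withHit = new DOMDocument();
$withHit->loadXML('<ResultSet><Result><Url>http://example.com/</Url></Result></ResultSet>');
$empty = new DOMDocument();
$empty->loadXML('<ResultSet></ResultSet>');

$hitUrl = firstNodeText(new DOMXPath($withHit), '//Url');
$noUrl = firstNodeText(new DOMXPath($empty), '//Url');
```

With such a helper, the last lines of getTopLinkYahoo could shrink to: return firstNodeText($xpath, '//Url');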

	// A tiny screen-scraping function avoids the overhead
	// of Google's SOAP API; this code will need
	// adjustments whenever Google drastically changes their
	// result-page HTML.

	function getTopLinkGoogle($q)
	{
		
		error_reporting(E_ERROR | E_PARSE);
		$dom = new domdocument;
		$dom->loadHTMLFile('http://www.google.com/search?' .
				'hl=en&q=' . urlencode($q) . '&num=1');
		$xpath = new domxpath($dom);
		$s = $xpath->query(
				"//p[@class='g']/a[@href]")->item(0)->getAttribute('href'); 
		if ( ! ( strpos($s, 'spell=1') === false ) && 
				! ( strpos($s, '/search?') === false ) )
		{ 
			$s = $xpath->query( 
				"//p/a[@href]")->item(1)->getAttribute('href'); 
		}

		return $s;
	}

Running the Hack

To run the code, upload autolinker.php5 to a web server and point your browser there. Add some text to the form and you should get a response similar to the one shown in the figure below.

Some auto-linked text


Now that all pieces of the hack are in place, nothing stops you from quickly spicing up any text with relevant hyperlinks!

Philipp Lenssen


