NAME

cs::URL - manipulate URLs


SYNOPSIS

use cs::URL;


DESCRIPTION

This module implements methods for dealing with URLs.


GENERAL FUNCTIONS

get(url,follow)

Create a cs::URL object from the url supplied and call the Get method below. If the optional argument follow is true then redirections (301 and 302 response codes) will be followed.

head(url)

Create a cs::URL object from the url supplied and call the Head method below.

urls(url,results,inline)

Return all URLs referenced from the page url via the hashref results, which on return will have URLs as the hash keys and the title of each link as the hash value. If the optional argument inline is true, return "inline" URLs (i.e. those specified by SRC= and BACKGROUND= attributes) rather than references (HREF=).

urlPort(scheme,port)

Given a scheme and port, return the numeric value of port. If the port parameter is omitted, return the default port number for scheme.
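
The lookup described above can be sketched in a few lines of standalone Perl. This is an illustration only, not the module's code: the name url_port and the contents of %DEFAULT_PORT are assumptions, and the sketch accepts only numeric ports, whereas urlPort may well accept more.

```perl
use strict;
use warnings;

# Hypothetical default-port table; the module's own table is not shown
# in this document.
my %DEFAULT_PORT = (
    http   => 80,
    https  => 443,
    ftp    => 21,
    gopher => 70,
);

# Return the explicit port as a number if one was given, otherwise the
# default port for the scheme (undef if the scheme is not in the table).
sub url_port {
    my ($scheme, $port) = @_;
    return $port + 0 if defined $port && $port =~ /^\d+$/;
    return $DEFAULT_PORT{ lc $scheme };
}

print url_port('http'), "\n";            # default port for http
print url_port('http', '8080'), "\n";    # explicit port wins
```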

undot(url)

Given the text of a URL, remove any . or .. components.
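
As a rough illustration of what removing . and .. components involves, here is a standalone sketch. It is not the module's implementation: the splitting regexp, the decision to leave URLs without an absolute path untouched, and the handling of a trailing . or .. are all assumptions made for the example.

```perl
use strict;
use warnings;

# Collapse "." and ".." segments in the path of a URL string.
sub undot {
    my ($url) = @_;

    # Split into optional scheme://host prefix, path, and ?query/#anchor tail.
    my ($prefix, $path, $rest) =
        $url =~ m{^([a-z][a-z0-9+.\-]*://[^/?#]*)?([^?#]*)(.*)$}is;
    $prefix = '' unless defined $prefix;

    # Only absolute paths are rewritten in this sketch.
    return $url unless $path =~ m{^/};

    my @segs = split m{/}, $path, -1;    # -1 keeps trailing empty fields
    my @out;
    for my $seg (@segs) {
        if    ($seg eq '.')  { next }
        elsif ($seg eq '..') { pop @out if @out > 1 }   # never pop the root
        else                 { push @out, $seg }
    }

    # A trailing "." or ".." names a directory, so keep a trailing slash.
    push @out, '' if $segs[-1] eq '.' || $segs[-1] eq '..';

    return $prefix . join('/', @out) . $rest;
}

print undot('http://host/a/./b/../c?x=1#y'), "\n";
```

A .. that would climb above the root is simply dropped here; other treatments are possible.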

search(engine,query,maxhits)

Return a URL string that can be used to query the specified search engine with the supplied query string. The optional parameter maxhits specifies the desired number of hits (or hits per page) to return; not all search engines support such an option.


OBJECT CREATION

new(url,base)

Create a new cs::URL object from the url string supplied. If base (a cs::URL object or URL string) is supplied, url is resolved relative to it.


OBJECT METHODS

Abs(relurl)

DEPRECATED. Use new cs::URL relurl, $this instead. Return a new cs::URL object from the URL string relurl with the current URL as base.

IsAbs()

DEPRECATED. Test whether this URL is an absolute URL. This is legacy support for relative URLs, which I'm in the process of removing in favour of two things: a method to return the relative difference between two URLs as a text string, and a way to generate a new URL object given a base URL and a relative URL string.

Context(scheme)

DEPRECATED. Return a URL representing the current context for the specified scheme. Use this URL's scheme if the scheme parameter is omitted. This is a very vague notion, drawing on the HTTP_REFERER environment variable as a last resort.

Text(noanchor)

Return the textual representation of this URL. Omit the #anchor part, if any, if the noanchor parameter is true (it defaults to false).

Scheme()

Return the scheme name for this URL.

Host()

Return the host name for this URL.

Port()

Return the port number for this URL.

Path()

Return the path component of the URL.

Query()

Return the query_string component of the URL.

Anchor()

Return the anchor component of the URL.

HostPart()

Return the user@host:port part of the URL.

LocalPart(noanchor)

Return the local part (/path#anchor) of this URL. Omit the #anchor part, if any, if the noanchor parameter is true (it defaults to false).

MatchesCookie(cookie,when)

Given a cookie as a hashref with DOMAIN, PATH and EXPIRES fields and a time when (which defaults to now), return whether the cookie should be associated with this URL.
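
The kind of test described above can be sketched as follows. This is a guess at the general shape, not the module's logic: the function name, the tail-match on DOMAIN, the prefix match on PATH, and the assumption that EXPIRES holds epoch seconds are all choices made for the example. Host and path are passed in explicitly so the sketch stands alone.

```perl
use strict;
use warnings;

# Decide whether a cookie applies to a given host and path at time $when.
# EXPIRES is assumed to be in epoch seconds (an assumption for this sketch).
sub matches_cookie {
    my ($host, $path, $cookie, $when) = @_;
    $when = time unless defined $when;

    # An expired cookie never matches.
    return 0 if defined $cookie->{EXPIRES} && $cookie->{EXPIRES} <= $when;

    # The host must equal the cookie domain or end with ".domain".
    my $h = lc $host;
    my $d = lc $cookie->{DOMAIN};
    $d =~ s/^\.//;
    return 0 unless $h eq $d || $h =~ /\.\Q$d\E\z/;

    # The cookie path must be a prefix of the URL path.
    return index($path, $cookie->{PATH}) == 0 ? 1 : 0;
}

my %cookie = (DOMAIN => '.example.com', PATH => '/shop', EXPIRES => 2000);
print matches_cookie('www.example.com', '/shop/cart', \%cookie, 1000), "\n";
```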

Get(follow)

Fetch this URL. If the optional flag follow is set, act on redirect responses etc. Returns a tuple of (endurl,rversion,rcode,rtext,MIME-object) where endurl is the URL object whose data was eventually retrieved and MIME-object is a cs::MIME object, or an empty array on error.

Head()

Perform a HEAD request for this URL. Returns a tuple of (endurl,rversion,rcode,rtext,MIME-object) where endurl is the URL object whose data was retrieved and MIME-object is a cs::MIME object, or an empty array on error.

URLs(hashref,inline)

Return the URLs referenced by the page associated with the current URL. The hash referenced by hashref will be filled with URLs and titles (from the source document, not the target URL's TITLE tag), using the URL for the key and the title for the value. See the cs::HTML::sourceURLs method for detail. If the optional parameter inline is true, return the URLs of inlined components such as images.

Proxy()

Return an array of (host,port) as the proxy to contact for this URL. Currently dissects the WEBPROXY environment variable.
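
Dissecting a host:port value from WEBPROXY might look roughly like this. The function name and the exact pattern are assumptions for illustration; the module may be more lenient about what it accepts.

```perl
use strict;
use warnings;

# Split $ENV{WEBPROXY} ("host:port") into a (host, port) list.
# Returns an empty list if the variable is unset or malformed.
sub proxy_from_env {
    my $spec = $ENV{WEBPROXY};
    return () unless defined $spec && $spec =~ /^([^:\s]+):(\d+)$/;
    return ($1, $2 + 0);
}

# proxy.example.com is a placeholder host name.
$ENV{WEBPROXY} = 'proxy.example.com:8080';
my ($host, $port) = proxy_from_env();
print "$host $port\n";
```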

AuthDB()

Return a cs::HTTP::Auth object containing the authentication tokens we possess.


ENVIRONMENT

WEBPROXY - the HTTP proxy service to use for requests, of the form host:port.


AUTHOR

Cameron Simpson <cs@zip.com.au>