Support Manual

Site Statistics
All hosting accounts come with
HTTP-Analyze preinstalled and configured.
We also offer Google Analytics
and AWStats as options. Google Analytics makes it easy to improve your
results online. AWStats is a powerful log analyzer which creates advanced web
statistics reports.
HTTP-Analyze is a log analyzer
for web servers. It analyzes the logfile of a web server and creates a comprehensive
summary report from the information found there. http-analyze has been optimized
to process large logfiles as fast as possible.
In easier-to-understand terms,
HTTP-Analyze is a powerful traffic analyzer that quickly and efficiently
delivers statistics on the traffic that your web pages have generated. Its
graphical user interface (GUI) produces your traffic reports at the click of
a mouse button.
How It Works
The web server is a program
running on a networked machine, waiting for connections from the outside world
and serving documents in response to requests from browsers.
To communicate, the server
and the browser use an asynchronous communication method called HTTP (the Hypertext
Transfer Protocol). It works as follows:
1. The user starts the browser and types in a URL.
2. The browser connects to the given host and requests the specified document.
3. The web server handles the request and sends out a response:
   - If this document exists, the web server delivers it.
   - If it does not exist or if access is not permitted, the web server sends back an error message instead.
The document delivered as an
answer to this request may contain inline objects. Inline objects are simply URLs
pointing to another resource, either a document, an image, an applet, a video/audio
stream, or any other addressable HTML object.

The browser then requests all
inline objects of the current page from the server using the steps 2 and 3 above,
before it can display the content of that page.
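As a rough illustration of why one page view causes several requests, the following sketch lists the inline objects a browser would fetch after receiving a page. It is a simplification under stated assumptions: the class name, the function name, the sample page, and the choice of tags treated as inline objects (`img`, `script`, `embed`) are all hypothetical, and real browsers consider many more cases.

```python
from html.parser import HTMLParser


class InlineObjectCollector(HTMLParser):
    """Collects URLs of inline objects from a page (simplified tag list)."""

    # Hypothetical, simplified mapping: tag -> attribute holding the object URL.
    INLINE_ATTRS = {"img": "src", "script": "src", "embed": "src"}

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = self.INLINE_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                self.urls.append(value)


def inline_objects(html_text):
    """Return the inline object URLs a browser would request after the page."""
    collector = InlineObjectCollector()
    collector.feed(html_text)
    return collector.urls


page = '<html><body><img src="/a.gif"><img src="/b.gif"><p>Hello</p></body></html>'
inline = inline_objects(page)
# One request for the page itself plus one per inline object.
total_requests = 1 + len(inline)
print(inline, total_requests)
```

For this sample page with two images, the browser sends three requests in total, and each of them is answered and logged separately by the server.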
This communication method is
called asynchronous because the browser sends out many requests for inline objects
at once, without waiting for a response from the server before sending the next
request, using different communication channels.

Since the browser's requests
are often handled by different server processes or different threads of a server
process, the logfile entries for a document and for its inline objects appear in
no guaranteed order relative to each other.
For example, the order in which
the server logs the successful transmission of the document itself and the inline
images contained therein is not predictable and depends on the type of documents,
objects, server speed, system and network load, and many other parameters.
What is logged?
Each and every response from
the server - whether it indicates success, an error, or even a timeout (i.e. no
response) - gets logged in the server's logfile. Since the server was hit by a
request, such a response is called a Hit. In other words, the total number of
hits must equal the total number of lines in the logfile minus the number of corrupt
and empty lines. A typical logfile entry in the Common Logfile Format looks like:
hostname - - [01/Feb/1998:10:10:00 +0100] "GET /index.html HTTP/1.0" 200 4839
The hostname field contains
the fully qualified domain name (FQDN) of the site accessing your server (see "Special
Cases" below). The next two fields usually contain a minus ('-') to indicate that
those fields are empty. The date is surrounded by square brackets ('[' and ']').
The next field contains the request: the request method ('GET', for
example), the name of the requested document (URL), and the protocol specification
('HTTP/1.0').
The following field contains
the server's response code ('200' stands for 'OK', while '404' means 'Document
not found', for example). The last field contains the size of the document. Some
servers log the number of bytes actually transferred, while other servers log
the size of the document, which makes a difference if the user interrupts the
transfer before the document has been transmitted completely.
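The field layout described above can be sketched as a small parser. This is a minimal illustration, not the way http-analyze itself reads logfiles: the regular expression, the function names, and the decision to treat every non-matching line as corrupt are assumptions of this example. The sample line is the entry from this section written on one line, with the two empty fields shown as minus signs.

```python
import re

# One Common Logfile Format line:
# host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<date>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf_line(line):
    """Return a dict of fields, or None if the line is empty or corrupt."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    # The request field is "METHOD URL PROTOCOL"; parts may be missing.
    parts = entry["request"].split()
    entry["method"] = parts[0] if parts else ""
    entry["url"] = parts[1] if len(parts) > 1 else ""
    entry["protocol"] = parts[2] if len(parts) > 2 else ""
    return entry


def count_hits(lines):
    """Hits = all lines minus corrupt and empty lines."""
    return sum(1 for line in lines if parse_clf_line(line) is not None)


sample = 'hostname - - [01/Feb/1998:10:10:00 +0100] "GET /index.html HTTP/1.0" 200 4839'
entry = parse_clf_line(sample)
print(entry["method"], entry["url"], entry["status"], entry["size"])
print(count_hits([sample, "", "garbage line"]))
```

Of the three lines passed to `count_hits`, only the well-formed entry counts as a hit; the empty line and the unparsable line are skipped.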
There are two other logfile
formats, the Combined and the Extended Logfile Format. They add the user-agent
(browser type) and the referrer URL (the page containing a link to the requested
document, if the request was generated by following a link)
to the logfile entry. These two fields are appended to
the Common Logfile Format (CLF) entry in one of two usual ways:
CLF Mozilla/2.0 (X11; IRIX 6.3; IP22) http://foo/bar.html
CLF "http://foo/bar.html" "Mozilla/2.0 (X11; IRIX 6.3; IP22)"
Note that in the second form,
the user-agent and the referrer URL are surrounded by double quotes, which makes
them ambiguous in certain cases, such as erroneous referrer URLs that themselves
contain double quotes. Therefore, the first form should be preferred if possible.
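To make the quoting problem concrete, here is a hedged sketch that splits the two quoted fields of the second form off the end of a line. The pattern and the function name are assumptions made for illustration only; they are not taken from http-analyze.

```python
import re

# Matches the two quoted fields at the end of a line in the second form:
# ... "referrer URL" "user-agent"
TRAILING_QUOTED = re.compile(r' "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"\s*$')


def split_extended_fields(line):
    """Return (clf_part, referrer, user_agent).

    referrer and user_agent are None if the two quoted fields are not found.
    """
    match = TRAILING_QUOTED.search(line)
    if match is None:
        return line.rstrip(), None, None
    return line[:match.start()], match.group("referrer"), match.group("agent")


clf = 'hostname - - [01/Feb/1998:10:10:00 +0100] "GET /index.html HTTP/1.0" 200 4839'
good = clf + ' "http://foo/bar.html" "Mozilla/2.0 (X11; IRIX 6.3; IP22)"'
bad = clf + ' "http://foo/b"ar.html" "Mozilla/2.0 (X11; IRIX 6.3; IP22)"'

print(split_extended_fields(good)[1:])
print(split_extended_fields(bad)[1:])
```

With the well-formed line, the referrer and user-agent are recovered. With a referrer that contains a stray double quote, this simple pattern no longer finds the two fields, which is the kind of ambiguity the note above warns about.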
The entries shown above are
the only information the server records in the logfile. Much more
information may be transferred from the browser to the server, but although this
additional information is available to CGI scripts running on your server,
it is not logged in the logfile. Therefore, http-analyze can only show you a
summary of the information in the logfile - nothing more, nothing less.
Special Cases
Caching in the browser:
As soon as a page has been
saved in a browser's disk cache, the browser might send out conditional requests
for documents or inline objects. Such a conditional request asks the web server to
send a document or object only if it has been modified since the last time the page
was requested (provided the page is still in the browser's cache). This way, network
traffic is reduced somewhat, since documents must be transferred only if they
have changed. If such a conditional request arrives, the server will
respond with a Code 304 (Not Modified) status to indicate that the document
hasn't changed or with a Code 200 (OK) status if it has changed in the meantime.
Since the browser may be configured (and usually is by default) to send
out such conditional requests only once per session and otherwise unconditionally use
the copy from the cache, you may not even see a Code 304 response if this
user visits your site again in the same session. Conditional requests are then
sent out again only after the user terminates the browser session and later restarts
the browser.
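The server's side of a conditional request can be sketched as follows. This is a simplified model, assuming the browser sends the date of its cached copy in the standard `If-Modified-Since` request header; the function name and the sample dates are hypothetical, and real servers also honor other validators.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_to_conditional(last_modified, if_modified_since=None):
    """Return the status code a server would send: 304 or 200.

    last_modified: timezone-aware datetime of the document's last change.
    if_modified_since: value of the If-Modified-Since header, or None for
    an unconditional request.
    """
    if if_modified_since is None:
        return 200
    try:
        cached_at = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparsable header: treat the request as unconditional
    if cached_at.tzinfo is None:
        cached_at = cached_at.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) <= cached_at:
        return 304
    return 200


changed = datetime(1998, 2, 1, 9, 10, 0, tzinfo=timezone.utc)
header = format_datetime(
    datetime(1998, 2, 1, 10, 0, 0, tzinfo=timezone.utc), usegmt=True
)
print(respond_to_conditional(changed, header))
print(respond_to_conditional(changed, "Sat, 31 Jan 1998 10:00:00 GMT"))
print(respond_to_conditional(changed))
```

The first call models a cached copy that is newer than the last change (Code 304), the second a cached copy that is older (Code 200), and the third an unconditional request (Code 200).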
Caching in a proxy server:
Organizations with a large
number of users - such as companies, universities, or online providers - often
use a so-called proxy server, mainly for two reasons:
- Often such organizations have
a firewall to protect their internal network against intruders. This means that
their network is logically separated from the rest of the Internet and that they
have to use a proxy server, which is able to communicate with both the inside
and the outside of their local network.
- To reduce network load somewhat,
the proxy server acts as a local copy machine: As soon as a page is loaded into
a browser through such a proxy server, the proxy saves a copy of this page in
its disk cache, much like a browser does in the scenario above. This way, documents
requested very often by users in the same local network need to be transferred
to the proxy only once. The proxy then answers future requests for the same page from
its local cache instead of connecting to the web server the document
originated from.
Both forms of caching make
it technically impossible to count visitors or to track their way through your
web site. All you see in the logfile of your server are a few initial hits
from the proxy or browser and probably some Code 304 responses resulting
from conditional requests sent out by the proxy or browser, depending on the preference
settings of the proxy or browser.
Definition of Terms
The statistics report contains
among others the following information:
- the number of hits, 304's, files, pageviews, sessions, data sent (in KB)
- the amount of data requested, transferred, and saved by cache (in KB)
- the number of unique URLs, sites, and sessions per month
- the number of all response codes other than 200 (OK)
- the average hits per weekday and for last week
- the maximum/average hits per day and per hour
- the number of hits, files, 304's, sites, data sent by day
- the top 5 days, 24 hours, 5 minutes and 5 seconds of the summary period
- the top 30 most commonly accessed URLs (hits, 304's, data sent)
- the 10 least frequently accessed URLs (hits, 304's, data sent)
- the top 30 client domains accessing your server most often
- the top 30 browser types
- the top 30 referrer hosts
- the overview/detailed list of all files requested
- the overview/detailed list of all sites by domain and reverse domain
- the overview/detailed list of all browser types
- the overview/detailed list of all referrer URLs
The following table summarizes
the meaning of all terms in the statistics report which are not self-explanatory:
| Term | Meaning |
|---|---|
| Hits | A hit is any response from the server on behalf of a request sent from a browser. This includes any response from the server, not only text files or documents. If, for example, an HTML page has two images embedded, the server generates three hits if this page is requested: one hit for the HTML page itself and two hits for the two inline images. |
| Files | If the user requests a document and the server successfully sends back a file for this request, this is counted as a Code 200 (OK) response. Any such response is counted as a file. Again, "file" here means any kind of file. |
| Code 304 | A Code 304 (Not Modified) response is generated by the server if a document hasn't been updated since the last time it was requested by the user and therefore there was no need to actually send the files for this document. This happens if the browser (or a caching proxy server between the browser and your web server) still has an up-to-date copy of the page in its local storage (cache) and therefore can display the page without requesting the actual content. This technique is used to reduce network traffic, but it also causes an inaccuracy in the statistics reports regarding the number of visitors, because the browser or proxy usually sends only one such conditional request per user session if it still holds an up-to-date copy of the file. However, the ratio between files and 304's reflects the efficiency of the overall caching mechanisms, at least for those hits which made their way to the server. |
| Pageviews | Pageviews are all files which either have a text file suffix (.html, .text) or which are directory index files. This number allows you to estimate the number of "real" documents transmitted by your server. If defined correctly, the analyzer rates text files (documents) as pageviews. Pageviews do not include images, CGI scripts, Java applets, or any other HTML objects; only files ending with one of the pre-defined pageview suffixes, such as .html or .text, are counted. |
| Other responses | There are many more responses than Code 200 (OK) and Code 304 (Not Modified), especially in the HTTP 1.1 protocol specification. For example, the server could generate a Code 302 (Redirected) response if a page has moved, a Code 401 (Unauthorized Request) response if access to the document is denied, or a Code 404 (Not Found) response if the requested page does not exist on this server. |
| KBytes transferred | This is the amount of data sent during the whole summary period as reported by the server. Note that some servers log the size of a document instead of the actual number of bytes transferred. In most cases this is the same. However, if a user interrupts the transmission by pressing the browser's stop button before the page has been received completely, some servers (for example all Netscape web servers) do not log the amount of data transferred but the amount of data which would have been transferred if the user had completely loaded the page. |
| KBytes requested | This is the amount of data requested during the whole summary period. http-analyze computes this number by summing up the values of KBytes transferred and KBytes saved by cache (see below). |
| KBytes saved by cache | The amount of data saved by various caching mechanisms such as in proxy servers or in browsers. This value is computed by multiplying the number of Code 304 (Not Modified) responses per file by the size of the corresponding file. Note: Because http-analyze can determine the size of a file only if the file has been requested at least once in the same summary period, the values for KBytes saved by cache and KBytes requested are just approximations of the real values. |
| Unique URLs | Unique URLs are the number of all different, valid URLs requested in a given summary period. This shows you the number of all different files requested at least once in the corresponding summary period. |
| Unique sites | This is the sum of all unique hosts accessing the server during a given time-window. The time-window is hardwired to the length of the current month. This means that if a host accesses your server very often, it gets counted only once during the whole month. Only the sum of the unique hosts per month is listed in the statistics report. |
| Sessions | Similar to unique sites, this is the number of unique hosts accessing the server during a given time-window. This time-window is one day by default for backward compatibility, but it can be changed with the option -u or the Session directive in the configuration file. For example, if the time-window is two hours, all accesses from a certain host in less than 2 hours after the first access from this host are lumped together into one session. All following accesses more than 2 hours apart from the first access will be counted as a new session. This way you may get an estimated number of how many sessions are started on different sites to access your server. |
1 shown only
on the total summary page.
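The terms in the table can be tied together in a short sketch that computes them from already-parsed log entries. It is only an approximation of the definitions above, not http-analyze's actual implementation: the tuple layout, the function name, the sample data, and the rule that a URL ending in "/" is a directory index file are assumptions of this example.

```python
from collections import Counter

# Pageview suffixes named in the table; treating URLs ending in "/" as
# directory index files is an assumption of this sketch.
PAGEVIEW_SUFFIXES = (".html", ".text")


def summarize(entries, session_window=86400):
    """Summarize (host, unix_time, url, status, size) tuples sorted by time.

    session_window is the session time-window in seconds (one day by default).
    """
    hits = files = not_modified = pageviews = bytes_sent = sessions = 0
    size_of = {}         # url -> size seen in a Code 200 response
    cached = Counter()   # url -> number of Code 304 responses
    session_start = {}   # host -> first access time of its current session

    for host, when, url, status, size in entries:
        hits += 1
        bytes_sent += size
        if status == 200:
            files += 1
            size_of[url] = size
            if url.endswith(PAGEVIEW_SUFFIXES) or url.endswith("/"):
                pageviews += 1
        elif status == 304:
            not_modified += 1
            cached[url] += 1
        start = session_start.get(host)
        if start is None or when - start >= session_window:
            session_start[host] = when
            sessions += 1

    # Saved by cache: 304 count per file times that file's size. Files never
    # sent with Code 200 in this period have no known size and contribute 0.
    bytes_saved = sum(n * size_of.get(url, 0) for url, n in cached.items())
    kb_sent = bytes_sent / 1024
    kb_saved = bytes_saved / 1024
    return {
        "hits": hits,
        "files": files,
        "code_304": not_modified,
        "pageviews": pageviews,
        "kb_transferred": kb_sent,
        "kb_saved_by_cache": kb_saved,
        "kb_requested": kb_sent + kb_saved,
        "unique_sites": len(session_start),
        "sessions": sessions,
    }


entries = [
    ("a.example", 0, "/index.html", 200, 2048),
    ("a.example", 1, "/logo.gif", 200, 1024),
    ("b.example", 60, "/index.html", 304, 0),
    ("a.example", 7200, "/index.html", 304, 0),
    ("a.example", 7300, "/missing.html", 404, 0),
]
report = summarize(entries, session_window=7200)
print(report)
```

With a two-hour window, the access from `a.example` at 7200 seconds starts a new session, so the two hosts produce three sessions but only two unique sites. The two Code 304 responses for `/index.html` are credited with that file's size, which is why KBytes saved by cache is an estimate that depends on the file having been sent at least once in the same period.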