the uptimer journal

SERVICE CHANGES AT BASICSTATE
2011-05-20

Effective 2011-06-01, email alerts will no longer be on a 30 minute enforced delay if an account is not funded. Alerts will revert to the delay set in the control panel by each user. Users who have set a delay under 5 minutes should consider setting the delay at 5 minutes. This gives the system the opportunity to cancel the alert if the automatic retest is successful. Please check your alert settings to allow for the new scheduling.

Concurrent with the return of free immediate alerts, testing charges will be phased in. All accounts that are less than 12 months old will be automatically credited with a further 6 months of testing for free. Accounts older than 12 months will be automatically credited with a further 3 months of testing for free. Thereafter, each month of testing will be charged at two service credits. This is equivalent to 50 cents per month as each credit costs 25 cents. If an account remains unfunded at the end of the grace period, testing and alerts will stop.

New accounts opened after the actual transition date will receive 10 usage credits upon signup. These credits may be used at the discretion of the account holder. For example, they are sufficient for 5 months of monitoring for a single site, or for 1 month of monitoring for 5 sites. Note that host names and user names are not transferable between accounts.

For those who have never purchased credits, the purchase link can be found on the account history page of the control panel. It is labelled "purchase usage credits". Credits are sold as a package of 50 credits for $12.50. Payments are processed through PayPal, but a PayPal account is not required.
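
The credit arithmetic above can be checked with a short sketch. The figures (25 cents per credit, 2 credits per site per month, 50 credits per package) come from this post. The constant and function names are made up for illustration and are not part of the control panel.

```python
from fractions import Fraction

CREDIT_PRICE = Fraction(25, 100)   # dollars per credit, as stated in the post
CREDITS_PER_SITE_MONTH = 2         # credits charged for one site for one month
PACKAGE_CREDITS = 50               # credits in one purchased package


def months_covered(credits: int, sites: int) -> int:
    """Whole months of testing that `credits` pays for across `sites` sites."""
    if sites < 1:
        raise ValueError("sites must be at least 1")
    return credits // (CREDITS_PER_SITE_MONTH * sites)


def monthly_cost(sites: int) -> Fraction:
    """Dollar cost of one month of testing for `sites` sites."""
    return CREDIT_PRICE * CREDITS_PER_SITE_MONTH * sites


if __name__ == "__main__":
    print(months_covered(10, 1))                   # signup credits, one site
    print(months_covered(10, 5))                   # signup credits, five sites
    print(float(monthly_cost(1)))                  # dollars per site per month
    print(float(CREDIT_PRICE * PACKAGE_CREDITS))   # price of one package
```

Exact fractions are used instead of floats so that cent amounts never pick up rounding noise.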

SERVICE CHANGES AT BASICSTATE
2010-02-10

update: how to reduce costs intelligently

Today marks the start of the fourth year of life for basicstate. In that time we have monitored servers and sent failure alerts for many thousands of sites. All of this has been accomplished without users paying a cent except for those wishing to have SMS alerts. We would like to thank our good friends at rackco.com for their generosity as a donor.

Now it is time for us to try to stand on our own two feet.

Having said that, we still wish to provide mostly free services. There are some big multi-nationals and government agencies using the services. But, there are also some grassroots organisations using the services who are doing stellar work.

Therefore, we are opting for a 4 point plan that modifies the way services are delivered. The changes will take effect as of 2010-03-15.

Testing will continue to be free as always with unlimited numbers of sites.
SSL tests will be withdrawn and existing https tests converted to http tests targeting the same host name.
Free alerts will continue to be offered, but an initial delay of 30 minutes will be introduced.
Alerts without delay will be offered as an option. Users will now be able to disable the default email account as an alert target.

Current SMS subscribers should recognise that these changes have no effect on them whatsoever.

In short, testing is free, delayed alerts are free, but immediate alerts are charged at 25 cents per alert.

Those who can accept the delay will continue to pay nothing, while those wanting instant alerts have the choice of upgrading. We are pretty sure that knowing your server went offline is worth 25 cents if you are running a higher profile site.

Finally, those who have more complex monitoring requirements than those offered here are reminded of our professional service at exactstate.com. The service offers monitoring of multiple protocols using simultaneous probes from multiple data centers. This might be just the right time to make the switch.

KEEPING THE SYSTEM CLEAN
2009-10-22

As most of you know, the monitoring service at basicstate.com is provided free of charge. A recurring problem with free services is that some users treat them with less respect. One particularly egregious case is the hosting company that has created over 1200 active tests. That would be fine except for the fact that their email server silently refuses connections from us. This causes the alerts and reports to be requeued for delivery for a period of 24 hours. Why they continue as subscribers when we cannot send them alerts or reports is beyond us. Sending them manual emails is hopeless, because they go undelivered too. There are a number of others with fewer tests who have seemingly abandoned their accounts.

Abandoned accounts still consume the same resources as active accounts. The difference is that the resources are expended without any benefit to anyone. Additionally, the resources are pulled away from active users.

Abandoned accounts also have a higher than normal incidence of dead servers. These dead servers consume more resources in testing and alerts than live servers. And, these dead servers skew the results of the global stats page.

To address this sad situation, we have started accumulating records on permanent delivery failures for all accounts.

Once enough records are available it will be possible to determine which accounts are unreachable on a permanent basis. The logic allows for servers that happen to be dead for a number of days. It also allows for servers that intermittently have problems.
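
The post does not spell out the actual rules, so the following is only a sketch of how such a check might look. The 30 day window, the 21 day threshold and the function name are all assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Iterable, Tuple


def is_unreachable(
    deliveries: Iterable[Tuple[date, bool]],
    today: date,
    window_days: int = 30,       # assumed look-back window
    min_failed_days: int = 21,   # assumed threshold for "permanent"
) -> bool:
    """Guess whether an account can no longer be reached by email.

    `deliveries` holds (day, delivered) pairs, one per delivery attempt.
    """
    start = today - timedelta(days=window_days)
    recent = [(day, ok) for day, ok in deliveries if start < day <= today]

    # Intermittent problems: at least one message got through recently.
    if any(ok for _, ok in recent):
        return False

    # A mail server that was dead for only a few days stays under the threshold.
    failed_days = {day for day, ok in recent if not ok}
    return len(failed_days) >= min_failed_days
```

Counting distinct failed days, rather than failed messages, keeps an account with many tests from being flagged faster than an account with one.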

The remaining accounts that are determined to be abandoned will be terminated. Unfortunately, it will not be possible to give notice since their email servers are either unavailable or will not accept connections from us. This grooming of dead accounts will be an ongoing automated process going forward.

Please be assured that accounts having temporary difficulties will not be terminated. That is not the point of this exercise. We simply want to shed ourselves of some selfish users who have no regard for the costs of running the service, or for the well being of active users of the service. This post is simply for historical purposes to show that nothing nefarious is being done.

GETTING A SECOND OPINION
2009-10-15

If you ever want to do just a quick http test then try newsreports.org.

It's great for those times when the host says it's your local connection, and you say it's not. Of course, if you are a host, it works the other way around too. You can also reach out and test from someplace outside your own network.

The next time someone starts giving you a headache about the server, just point them to newsreports.org. As a matter of fact, just tell them about it before it happens.

GO PRO WITH EXACTSTATE
2009-10-01

This should be good news for all those who have been asking for more features. The pro version of basicstate.com is now available. exactstate.com is now in general release. Be sure to check out the 2 for 1 signing bonus. It can cut the cost in half.

The primary enhancements are:

testing period down to 1 minute
configurable testing url
parallel testing from multiple data centers
tests for http, https, smtp, imap, pop3, ftp, telnet, ssh, custom tcp ports
cross-account publish/subscribe model for sharing alerts
more reports and charts (in progress)

Like all services in the group, the payment model is strictly pay for use. There are no monthly minimums to deal with. The full details are on the pricing chart.

So, it's the same service approach. But, with better features!

Account migrations are available. Just sign up for a new account, fund the account, then send an email to support. Your existing tests at basicstate.com will be moved over to exactstate.com. Multiple accounts can be migrated to a single account if desired. Be sure to check out the 2 for 1 bonus offer before deciding on your first purchase.

If you currently have multiple users, please see the manual first. The new cross-account features mean that each user should have their own account. So, if you are migrating, please ensure that each user has signed up for an account. Then, add them as alert subscribers using the group membership page before asking for the migration. This way all alerts will be set up for them automatically during the migration.

STAYING ALIVE AFTER THE SERVER CRASHES
2009-09-11

Subscribers to server monitoring services use the services for one reason and one reason only. They care about web site availability. When their web server goes down, they want to know about it.

But what happens then? Do users have to wait until the server is available again? Well, the good news is that the answer is no. It is possible for a server to go down without the web site going down if you have done the right things.

One of the approaches you can take is to use a dns service provider for global failover capabilities. The dns based failover system monitors multiple servers and serves the correct dns record depending on specified failover rules.

Some sites don't use just failover. Instead, they choose to use global load balancing in addition to failover. In that scenario, clients are served from multiple servers in geographically dispersed locations. This has the effect of directing clients to the closest server for the best speed. However, when a failure takes place, the failed server is taken out of rotation, and the users are directed to the remaining servers. This type of architecture takes best advantage of bandwidth that is already paid for. It might as well be used instead of wasted while the server is on standby. Some sites also designate hot-standby servers to take the place of a failed server.
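
The selection rule described above can be sketched in a few lines. This is not how edgedirector.com or any particular dns provider works; the `Server` type, the region labels standing in for "closest", and the addresses (taken from the documentation ranges) are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Server:
    address: str
    region: str            # crude stand-in for geographic closeness
    healthy: bool          # result of the most recent monitoring probe
    standby: bool = False  # hot standby, used only when needed


def answer_for(client_region: str, servers: List[Server]) -> List[str]:
    """Pick the addresses a failover-aware dns service might hand out."""
    # Failed servers are taken out of rotation.
    active = [s for s in servers if s.healthy and not s.standby]
    if not active:
        # Every regular server is down: bring in the hot standbys.
        active = [s for s in servers if s.healthy and s.standby]
    # Prefer servers in the client's own region, else any remaining server.
    local = [s for s in active if s.region == client_region]
    return [s.address for s in (local or active)]


if __name__ == "__main__":
    pool = [
        Server("192.0.2.10", "eu", healthy=False),
        Server("198.51.100.10", "us", healthy=True),
        Server("203.0.113.10", "us", healthy=True, standby=True),
    ]
    print(answer_for("eu", pool))
```

In the example pool the European server has failed, so a European client is directed to the remaining regular server rather than to the standby.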

So yes, basicstate.com will tell you if a server is out of action. But, edgedirector.com can keep your site alive.

By the way, edgedirector.com is also a good bet if you find that you are having problems with dns performance. You might not need it today, but you might as well go read about it today, just in case. All of the links open in a new window so that you won't lose your place here.

UP AND COMING WEB DESIGN SITE
2009-05-27

The Design Live decided to survey uptime monitoring solutions, and of course basicstate made it to the top of the list! The site deserves a look because the articles are well written, informative and in depth. It is definitely worth reading if you are design oriented, so don't forget to take a look at the home page.

OPTIMISING WORDPRESS PERFORMANCE
2009-05-26

Some WordPress sites are noticeably faster than others. Getting the best performance out of a piece of software is an art. Some good tips can be found in this article. Be sure to also check out the external links at the bottom of the page.

NEW FAILURE TICKET ACCESS AND NOTES
2009-04-03

There is now a new way to access detailed information about test failures. The information can be accessed from the trouble tickets menu selection.

At the first level this new feature presents a reverse chronologically ordered list of failures in the preceding 30 days. Unresolved problems are marked with a red ball icon to indicate that attention is required.

A detailed view of the particular test failure is accessed by clicking on the name of the test. This brings up a detail page with complete information on the failure. The information includes the usual diagnostic information that is based on the various stages of the test. Please remember that this is a best guess based upon the limited information available to us when the test is performed. It is a good place to start looking for the problem, but the rest is up to you.

There is space for you to add optional user notes for future reference. The available space for individual notes is set at approximately 8000 bytes.

NEW DAILY SERVER UPTIME DATA POSTBACK

A number of users have asked for the ability to display their server uptime stats on their sites. basicstate.com is currently testing the postback of uptime statistics to the origin server for this purpose.

The data can then be used to create an uptime page on your own site, for private reports, or even to create emails for your own clients or associates.

The postback takes the form of an http GET request from the basicstate.com server to the server that was originally tested. The postback will take place once a day just after the daily emails are sent.

An example postback is shown below expressed as a url for the site example.com:

http://example.com/basicstate/?date=2009-01-19
                              &pct=98.12
                              &dns=0.375
                              &con=0.819
                              &req=0.819
                              &tfb=0.848
                              &tlb=0.922

note: url has been line wrapped for screen width

Therefore, the site is example.com and the page is /basicstate/

query parameters:

date:   record date formatted as yyyy-mm-dd 
pct:    uptime percentage
dns:    dns resolution time      (seconds)
con:    socket connection time   (seconds)
req:    request transmitted time (seconds)
tfb:    time to first byte       (seconds)
tlb:    time to last byte        (seconds)

The page location /basicstate/ is arbitrary. It is a directory location which allows any site using any dynamic request processing technology to process the incoming request without having to specify an extension.

An application note will be posted in the application notes section after release. In the meantime, if you have been looking for this feature, you can safely start developing based upon the above information. The above specification is locked for this release. Future enhancements may add variables, but not change the currently defined variables.

The application note will also feature contributed processing code with or without author attribution as requested by the contributor. So, once you have working code and want to make a contribution, please submit it by email for inclusion.

Since the update is once a day, we suggest that the processing script create a static result page for the most efficient use of system resources.
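
As a starting point, here is a minimal sketch of a processing script in Python. It is not the contributed code mentioned above. The function names are invented for this example, and how the request reaches the script (cgi, framework, rewrite rule) is left to the site. The page it builds includes the attribution link that the conditions of use in this post require for public display.

```python
from datetime import date, datetime
from html import escape
from urllib.parse import parse_qs, urlsplit

TIMING_FIELDS = ("dns", "con", "req", "tfb", "tlb")  # all reported in seconds


def parse_postback(url: str) -> dict:
    """Extract the documented variables from a postback url.

    Unknown variables are ignored, since later releases may add more.
    Raises KeyError or ValueError if a documented variable is missing
    or malformed.
    """
    query = parse_qs(urlsplit(url).query)
    record = {
        "date": datetime.strptime(query["date"][0], "%Y-%m-%d").date(),
        "pct": float(query["pct"][0]),
    }
    for name in TIMING_FIELDS:
        record[name] = float(query[name][0])
    return record


def render_static_page(record: dict) -> str:
    """Build a small static html fragment; the caller decides where to save it."""
    rows = "\n".join(
        f"<tr><td>{escape(name)}</td><td>{record[name]:.3f} s</td></tr>"
        for name in TIMING_FIELDS
    )
    return (
        f"<h1>Uptime for {record['date'].isoformat()}: {record['pct']:.2f}%</h1>\n"
        f"<table>\n{rows}\n</table>\n"
        '<a href="http://basicstate.com/">'
        "uptime monitoring provided by basicstate.com</a>\n"
    )


if __name__ == "__main__":
    example = (
        "http://example.com/basicstate/?date=2009-01-19&pct=98.12"
        "&dns=0.375&con=0.819&req=0.819&tfb=0.848&tlb=0.922"
    )
    print(render_static_page(parse_postback(example)))
```

Writing the result to a file once a day means visitors are served a plain static page and the script itself runs only when the postback arrives.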

Now, some words on conditions of use:

Private use of the data by the site is permitted without restriction. However, public display of the data requires attribution in the form of a visible, plain link to the basicstate.com home page, without the nofollow attribute. An example of an acceptable link in source code format is:

<a href="http://basicstate.com/">
uptime monitoring provided by basicstate.com
</a>

This would display as:

uptime monitoring provided by basicstate.com

Sites that display the data publicly without a link will have data transfer privileges revoked permanently. Of course, the link wording may be altered to suit the language of the site.

Data transfers may only be enabled for sites under the control of the account holder. We do not want to receive irate emails from webmasters who are seeing unauthorised requests. So, please only use the data transfer facility for your own sites.

The data transfer facility will be managed by means of status code monitoring. If multiple http 404 error codes are returned from the postback, the postback will be automatically and permanently disabled for that site.

NEW RELEASE POST MORTEM

The new release has been in production for about 12 hours now. A report cycle has passed and the data looks good. As a matter of fact, the data looks better than ever. The most significant change is that the dns lookup times are more realistic because the software is not caching the dns lookups any more. The related improvement is that changing ip addresses for a site will no longer result in tests being stuck at the old address improperly. This alone has reduced the number of test failures when looking at our dashboard. As this is being written, the system looks very stable and responsive.

Of course there were problems. We are sorry that it took some hours to find the root cause. In the meantime, a small, but significant number of users were each flooded with hundreds of false alerts. From the response, it seems most affected users either recognised them as being part of a transient problem, or had actually read the upgrade announcement in the daily reports. Whew! Only two nastygrams!

Four lines of code were commented out in testing because they were incompatible with the production database stored procedures. These four lines of code did not make it into production. The result was that every encountered existing failure was categorised as a new failure. As failed sites were revisited, additional alerts were scheduled and sent even though earlier alerts had already been sent. The problem affected accounts with failed servers which were not brought back online. Some users received hundreds of alerts. Of course, some of the sites were old tests which users had not marked as inactive in their accounts. These sites had previously been taken offline and the account holders had simply waited out the alert lifetime.
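
The distinction that went missing can be illustrated with a small sketch. This is not the production code or the four missing lines, which involve database stored procedures; the in-memory dictionary and the function names are assumptions made for the example.

```python
from datetime import datetime
from typing import Dict


def record_failure(open_failures: Dict[str, datetime], site: str, now: datetime) -> bool:
    """Note a failed test; return True only if this is a new failure.

    A site already listed in `open_failures` is an existing failure, so
    revisiting it must not schedule a fresh round of alerts.
    """
    if site in open_failures:
        return False
    open_failures[site] = now
    return True


def record_success(open_failures: Dict[str, datetime], site: str) -> None:
    """Close any open failure once the site passes a test again."""
    open_failures.pop(site, None)
```

Without the membership check in `record_failure`, every revisit of a dead site looks like a new failure, which matches the flood of alerts described above.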

Finally, in the last round of testing we were able to estimate the number of ssl sites that would be affected by the stricter testing standards. It appears that about ten percent of sites will not pass the new tests. Amongst the reasons for not passing are: self signed certificates, invalid certificates, unknown certificate issuers, and protocol implementation. Some of the sites were visited manually using Internet Explorer 6.0 sp1 and Netscape Communicator 7.2 as verification. The results were consistent with the automated test results. In every case the problem indicated by the test was also seen in the manual verification test. It would seem that if a user receives an alert complaining about an ssl enabled site, then it would be wise to review the ssl implementation for that site in particular.

We pushed the install date a little bit to have as much break-in time as possible before the holidays. The benefit is a more reliable system so that everyone can go about their holiday season plans with less worry.

NEW SOFTWARE RELEASE

We will be deploying the newest version of the custom testing engine. This involves both good news and bad news. Hopefully, more good than bad.

The good news comes on three fronts. First, the new release addresses the problem of dns caching. The current release uses a third party library that caches dns entries until the engine is restarted. Since the engine is intended to run forever, web sites that change ip addresses have been marked down when they should not be. The cache behaviour also masks dns problems and slow dns responses. No matter what we did, this behaviour could not be modified. So, we replaced the library with our own code. Second, using our own code allows for much stricter ssl certificate analysis. This is helpful in alerting administrators to the fact that their visitors might not be receiving the optimal experience with respect to ssl. Third, the new code is much more efficient than the third party library. This will allow us to do many more tests with the server resources at our disposal.

Now for the bad news. And, it really is not so bad. You just need to be aware of it. First, there will be discontinuity in the timing numbers in your data because of the change in dns query behaviour. This means that you will likely see a longer time to last byte total. Second, ssl certificate alerts may be issued for ssl certificates that used to pass inspection in the old version but not the new version. Since the new version inspects the server certificate in the same way a browser does, it is not actually bad news to know about problems. But, you should be aware that it might happen. Anyone who is aware that they are not truly ssl standards compliant on their ssl sites may want to address that ahead of time. We noticed during testing that some ssl sites are using self signed certificates, or certificates that do not match the host name. These sites will start getting ssl certificate alerts. Third, there may be small problems with the rollout of the new software. If they are small we will stick with the new software while we address the problems. If they are major, we will roll back to the old version, lick our wounds, and try again.

As for timing, we would love to be able to do this when everyone is awake and at work, rather than waking someone from a good sleep. But, the reality is that with users from around the world, no matter what time is chosen, it will be night time somewhere. However, we will at least make sure that it is midweek rather than a weekend.

If you notice any problems that persist, please feel free to send us an email. As always, the more details you include, the better we will be able to reproduce and understand the problem.

SERVICE INTERRUPTION

Services from basicstate.com were interrupted by a ddos reaching a traffic level of 8Gbps+ on November 21. The ddos was not directed at basicstate.com, but it affected all servers in the neighbourhood. The actual target was a particular .ch registered site and a .com registered site used as a ddos mitigation alias.

The ddos was part of a larger 20Gbps+ ddos affecting the dns servers of the largest network providers and registrars in Germany. The problem reached its peak on November 21, affecting some 2.7 million domains. This left most .de domains unreachable from around the world.

At the present time, there is still some network activity from around the world, but it is greatly reduced. This allows us to resume operations.

Operations will resume in stages. The testing platform will start first, followed by email services, and finally, the basicstate.com web site.

There will be anomalies in the daily report due to the fact that thousands of new dns lookups have to be done during the first cycle. During the initial restart, no failures will be reported during the warmup period due to the risk of false alerts. Other effects may be noticed, but it is unknown what those will be at the current time.

Our hosts rackco.com have been very helpful to basicstate in this time of need. They stood with us the whole time during this difficult period. Technical personnel were always immediately available 24x7 to help us assess traffic levels, adjust network routing and apply filters. We are extremely grateful for their help, both during this episode, and at all other times. basicstate.com operates on colocated equipment, but managed dedicated and vps are also available from rackco.

DIRECT SMS ALERTS NOW AVAILABLE

SMS text alerts have always been available to users through the free sms to email gateways offered by various sms carriers.

However, not all carriers have gateways available. And where they are available, experience has shown that there can be problems.

First, some gateways limit the number of sms messages in a given period. Some go as far as to count all messages to all subscribers going through the gateway as a single total in calculating the limit. This can be very limiting when multiple alert subscribers use the same carrier.

Second, many gateways treat sms text messages submitted by email gateways as lowest priority. This can cause delays or even missed messages during busy periods.

The most direct route available for a sms text message to reach your phone is through the carrier SMSC. The SMSC is the carrier switching center for text messages addressed to their subscribers.

This optional capability is now available to all basicstate.com users as an additional alert delivery channel. The word optional is emphasised because it requires the purchase of sms message credits.

All monitoring and alert services, including sms through email gateways, continue to be free. If sms via email is working well for you there is no need to use the direct SMS option.

Direct SMS text alerts are simply an addition to the choice of alert methods that an account can choose to use.

We hope that this option will please those who have asked for it in recent months.

As a bonus, all existing accounts now have a balance of five free sms message credits. New accounts will also get the same credits at the time of creation. These credits enable account holders to test their sms devices for network reachability without incurring any charges or buying credits.

This addition means that there is a new menu entry. Account History will lead you to a consolidated history of optional credit usage and credit purchases. The purchase link is available from that page. And, finally, there is a test page for sending a test message to all three types of alert destinations. This allows users to ensure that an alert destination is reachable in advance. It is directly accessible from the purchase page.

Please note that limits on the number of alerts sent for a single failure have been introduced.

Regular email alerts are now limited to 12 for any single failure. This is intended to save recipients from being flooded with dozens of messages when their email server has gone down with their web server. Our advice is to never use an email account located on the same server. If you do, and the web server goes down, the alerts have no way of reaching you.

SMS alerts are now limited to 3 for any single failure. This is intended to avoid flooding your phone with alerts. And, in the case of direct SMS, to avoid excess charges.
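
The two limits can be pictured as a per-failure counter. The limits of 12 and 3 come from this post. The class below and its names are only an illustrative sketch, not the code running on the service.

```python
ALERT_LIMITS = {"email": 12, "sms": 3}  # maximum alerts per single failure


class FailureAlertCounter:
    """Tracks how many alerts have gone out for one failure."""

    def __init__(self) -> None:
        self._sent = {channel: 0 for channel in ALERT_LIMITS}

    def allow(self, channel: str) -> bool:
        """Return True and count the alert if the channel is under its limit."""
        if channel not in ALERT_LIMITS:
            raise ValueError(f"unknown alert channel: {channel}")
        if self._sent[channel] >= ALERT_LIMITS[channel]:
            return False
        self._sent[channel] += 1
        return True
```

One counter would be created when a failure is first detected and discarded when the site recovers, so the next failure starts again from zero.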

BLACK IS THE NEW BLACK

If you've been here before, you know that the site is sporting a whole new look. Some changes are subtle, and others more obvious. Among the obvious is the colour scheme. Older users will remember a time when all screens were green or white on black. It was called green screen. They were always much more relaxing to work with and that is why it has been brought back at basicstate. Inside, you will find that the working area to the left has been extended where appropriate. When you log in, the private menu will be at the top where you can see it. This new look was introduced at this time because it was already part of the new pages for sms credits. Those pages are not active yet, but will be soon. Look for the announcement in your daily report emails.

THE WEB IS NOT MAIN STREET

A common mistake that is seen over and over again on the web is the presumption that business on the web operates just like business on Main Street. Big mistake and a recipe for disaster. Build it and they will come applies even less to the web than it does to Main Street. If you open a corner store on Main Street, at least some potential customers will stumble in out of curiosity. The store is there, it is right in front of their eyes. The web is entirely different. Building a web site and waiting for customers is like building a corner store in the middle of the woods and hoping that customers will come by before all the milk spoils. It's not going to happen. The site might offer the greatest widget in its niche. It's still not going to happen. Not in a million years. Or more aptly, not in a million web sites. As of the writing of this screed, there are over 50 million active web sites. There are estimated to be 200 million active web users. The simple arithmetic suggests 4 users for every site. 4 users is hardly enough traction for a site to achieve any type of success. Friends and family could probably exceed this number handily. Even 4 users is overly optimistic given that the top 100 sites get the majority of the available eyeballs. The only way to get more than your fair share of eyeballs is to get out there and drag them in. In other words, plain old hard work. Every single day.

PLEASE THINK AND LINK

There are some things that thoughtful people need to take a stand on. Taking a stand means doing any little thing possible to help in a situation. Films and documentaries by social commentator Michael Moore are often relegated to non-prime time slots on public television. This is a shame because the ideas presented ought to be made widely available and considered by the greatest number of people possible. To that end, you can read about his films at Michael Moore's site.

Related in theme, but just recently introduced, is the new satellite site for the Canadian Centre for Policy Alternatives, growinggap.ca - the growing gap between the rich and the rest of us

Of course, "think and link" has a certain ring to it, but really, it should be "think, comment, and link".

IT'S THE LITTLE THINGS

A rock solid application instills confidence in users. Unexpected results are disconcerting and frustrating for users. No matter how easily explained by the programmer, the user will not be a satisfied user.

For example, the title of this entry has an apostrophe in it. It's a punctuation mark that appears in many everyday uses. But, certain uses of the apostrophe in combination with javascript will cause unexpected failures. This fact has probably hit every coder who has ever tried to use javascript in a non-trivial way.

The easy way out for the coder is to have the application refuse input of data fields containing the apostrophe. The harder and riskier way is for the coder to find all the places where the apostrophe will interfere with the application, and to code defensively around them so that the apostrophe displays as the user intended without blowing up the application. Guess which approach is going to make the user happier?
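
One way to code defensively, shown here only as a sketch, is to let the server produce the javascript string literal instead of pasting user text between quotes by hand. The example assumes a Python backend and the helper name is invented; it does not describe how this site is built.

```python
import json


def js_string_literal(text: str) -> str:
    """Return `text` as a javascript string literal, quotes included.

    json.dumps handles double quotes, backslashes and control characters.
    The loop additionally escapes characters that could end a single-quoted
    html attribute or a script block around the literal.
    """
    literal = json.dumps(text)
    for ch in ("'", "<", ">", "&"):
        literal = literal.replace(ch, "\\u%04x" % ord(ch))
    return literal


if __name__ == "__main__":
    title = "It's the little things"
    print("var title = " + js_string_literal(title) + ";")
```

The user still sees the apostrophe exactly as typed, because the browser turns the escape sequence back into the original character when it evaluates the literal.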

Harder work for the few, but many more happy users. Seems fair to us!

Come to think of it, the coder ought to feel some pride in taking the correct path rather than the quick way out.

JUST A REGULAR PLEASE

Well, the regular expression was not quite right. While cruising the testing stats, we found a test spec that had been improperly truncated. Having a new bug crop up as a result of a previous fix is not to be encouraged, but it did give us a chance to demonstrate our approach to customer service. Yes, customer service. Even though our monitoring services are free to all subscribers, the subscriber is still, in our eyes, a customer.

An email was sent to the subscriber administrator querying what the host name should be with a promise of a manual fix and immediate attention to the code. In the meantime, the subscriber had sent an email reporting the problem and specifying the fully qualified domain name required. We manually corrected the data record, reactivated the test and informed the customer of the manual fix. In his words: "wow ...I did not expect to get it corrected that fast !!!".

Why wow? Well, the problem was reported at 4:07am, a remedy plan sent at 4:19am and a final resolution reported at 4:23am. Fixing the code took another couple of hours. We can't do this all the time, but we can try.

Just like you, we are tired of making telephone calls and sending emails to customer service departments which have no effect. This happens because in reality customer service departments never seem to have the tools or authority to fix problems. Good customer service reps can be as sincere and sympathetic as they like, but they are often constrained by the reality that customer service departments have the purpose of screening complaints at the lowest cost.

We intend to bring back old fashioned customer service by directly responding to the specifics of every customer query. No canned responses allowed. We have the tools and authority, and we plan to use them to the fullest effect.

This is actually quite selfish. We fully realise the effect of dealing with customers well. A satisfied customer can bring new customers to the table much better than any sales or promotional effort we could possibly undertake.

It is often repeated by marketing types that keeping a customer is far cheaper than getting a new one. But very few organisations have studied the numbers. We did a spreadsheet forecasting subscriber acquisition rates based on varying rates of subscriber signups resulting from the recommendations of existing subscribers.

Being able to deliver good service feels good, but the numbers tell us that having great customer service is the only way to go. We hope others will follow.

MISSING DAILY UPTIME REPORTS FIXED

Random occurrences of missing daily reports have finally been fixed. This took a rather long time to find because of the difficulty of tracing executables running as system services. This was compounded by the fact that the process is scheduled to run only once a day.

TESTING URL PATH SPECIFICATION INSERT FIXED

A few users were able to insert testing url specifications for pages other than the home page. There was also an instance of an ip address being used in the host specification. These anomalies have been remedied in the regular expression that validates the requested url.
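
The actual expression is not published here. As an illustration of the kind of check involved, the sketch below accepts only a fully qualified host name and rejects paths, schemes and dotted ip addresses. The pattern and the function name are assumptions for this example.

```python
import re

# One or more dot-terminated labels followed by a final label that starts
# with a letter. The letter rule is what keeps dotted ip addresses out.
HOSTNAME_RE = re.compile(
    r"(?=.{1,253}\Z)"                                # overall length limit
    r"(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+"   # leading labels
    r"[a-z](?:[a-z0-9-]{0,61}[a-z0-9])?",            # final label
    re.IGNORECASE,
)


def is_valid_test_target(spec: str) -> bool:
    """True if `spec` is a bare host name with no scheme, path or ip address."""
    return HOSTNAME_RE.fullmatch(spec) is not None


if __name__ == "__main__":
    for candidate in ("www.example.com", "example.com/page.html", "192.0.2.10"):
        print(candidate, is_valid_test_target(candidate))
```

Using `fullmatch` rather than `match` means trailing characters such as a path or a stray newline cannot slip past the end of the pattern.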

DOUBLE INSERTS FIXED

Users who subscribed in the last few days were having duplicate account records inserted into the database. While one of the accounts was dormant, it cascaded into the alert destination insertion routines. This caused every requested alert destination to be inserted in duplicate. The database has been cleaned up and the code error fixed.

MOZILLA BASED BROWSER INCOMPATIBILITY FIXED

Netscape and Firefox users can now do everything from the comfort of their home browser. A minor bug affecting one javascript action link was fixed. An obscure incompatibility, but easy to fix once found.

RESEARCH ON FIVE NINES

There must be a homework assignment due in the next few days on 5 9's uptime. The hits from the search engines are all showing these keywords as the query in the referrers. Well, we hope the information helps. Especially, the uptime calculation table.
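
For anyone doing that homework, the arithmetic is short. The sketch below is not the uptime calculation table itself. It shows only the underlying calculation, with a 365 day year assumed and a function name invented for the example.

```python
SECONDS_PER_DAY = 86_400


def allowed_downtime_seconds(uptime_pct: float, period_days: float = 365.0) -> float:
    """Downtime budget, in seconds, for a given uptime percentage."""
    if not 0.0 <= uptime_pct <= 100.0:
        raise ValueError("uptime_pct must be between 0 and 100")
    return (100.0 - uptime_pct) / 100.0 * period_days * SECONDS_PER_DAY


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_seconds(pct) / 60
        print(f"{pct:>7}% uptime allows about {minutes:,.1f} minutes down per year")
```

Five nines over a 365 day year works out to a little over five minutes of downtime, which is why the figure is so hard to achieve.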

BASICSTATE.COM WEBSITE UPTIME MONITOR SERVICE

Well, the main site and the services behind it have launched publicly.

That means that there is a need for a place to interact publicly. A place to write down all the stuff that comes to mind that does not fit into an faq or news release section.

The site launch also means that there is finally time to write without feeling that there is something else that should be done.

This is because the greater part of running a site is not the coding, but taking care of the users.

THE SOFTWARE BEHIND THIS JOURNAL

The software behind the uptime journal is about as simple as it gets. There is none. The page is handwritten in plain text and included in a template. Simple and effective. No software of any kind to install.

It may not be the way to go for anyone else, but for us this seems to be the easiest way to get going.

source: basicstate.com