up.time Version 7.2 Release Notes - July 2013
New Features in Version 7.2
up.time 7.2 includes various new features.
Custom Dashboards
This release introduces fully configurable dashboards, which allow you to present the most relevant information to diverse sets of users.
Using a point-and-click gadget browser, select dashboard components that you can then arrange in one of many layout templates using a simple, drag-and-drop interface. This release includes an extensive array of dashboard layouts, and comes bundled with a starter library of gadgets. Now, administrators and users alike can build role-based, content-rich pages in minutes.
We will continue to refine existing gadgets, and add new ones to enhance the out-of-box experience. However, users can also take advantage of uptime software's community and developer resources to further extend the capabilities of their up.time deployment:
Extend: Head over to The Grid and see how our solutions architects are extending the up.time platform.
Create and Customize: Want to tweak an existing gadget, or build one from scratch? Use the up.time API, as well as helper classes on GitHub.
Share: Participate with the uptime software community and upload your dashboards for peer review and further customization.
Resource Hot Spot Report
The Resource Hot Spot report is a key checkpoint report that allows you to quickly identify servers and network devices across your enterprise that may be having performance issues, so you can immediately start working to identify what may be causing them.
The Resource Hot Spot report helps you answer the following types of questions:
- Which servers and network devices in my infrastructure were the top consumers in various resource usage categories?
- Which physical servers are running short on memory, or are overworked?
- Which VMs' processes need to be shared with another instance?
- Are all network devices correctly configured?
- For resource-strained Elements, is there a configuration issue or is it a resourcing issue?
The resource consumer summaries rank physical and virtual servers as well as network devices in various resource-related categories, allowing you to correlate top-ranking consumers across categories to identify specific problems. The summaries are populated irrespective of whether critical outages occurred, or the infrastructure experienced 100% uptime; this allows you to identify potentially strained resources before they enter warning-level states.
The report is also a valuable investigative tool that helps you quickly focus on the parts of your infrastructure that require troubleshooting. The report can be configured to include full listings of threshold-violating servers and network devices based on key resource-usage metrics such as memory and CPU usage, port throughput caps, or packet-issue counts. The high, low, and average for these metrics are presented, along with historical graphs for offending metrics; these details can help you confirm whether sustained resource strain or wild swings are being caused by resourcing deficits or configuration errors.
The following are portions of an example Resource Hot Spot report:
Server Uptime Report
The Server Uptime report is a key checkpoint report that provides you with a focused and succinct snapshot of your infrastructure's availability. Report components include overall availability based on a defined uptime threshold, availability by defined interval over the reporting period, as well as tallies for total outages and unique Element outages. To assist with follow-up actions, Elements are listed by outage time and include details that help you determine whether the outage frequency or duration is contributing the most to total downtime. The Server Uptime report helps you answer the following types of questions:
- What is the overall uptime of my entire infrastructure, and am I meeting my availability target?
- What is the overall count of outages and my mean time to repair when there is a failure?
- Which Elements or groups are experiencing the most downtime?
The Server Uptime report is also a key starter report, as it is automatically created and saved for new up.time installations. This daily report provides an hourly breakdown of availability, using a 95% uptime threshold. By default, a PDF version of the report is emailed to the SysAdmin user group.
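As a rough illustration of what an hourly availability breakdown against a 95% threshold involves, the following Python sketch computes per-hour and overall availability from seconds of uptime. The function names, the sample data, and the choice to compare each hour against the threshold are assumptions made for this example; they are not taken from up.time's report engine.

```python
SECONDS_PER_HOUR = 3600


def hourly_availability(up_seconds_per_hour, threshold=95.0):
    """Return (hour, availability %, meets threshold) for each hour."""
    rows = []
    for hour, up_seconds in enumerate(up_seconds_per_hour):
        percent = 100.0 * up_seconds / SECONDS_PER_HOUR
        rows.append((hour, round(percent, 2), percent >= threshold))
    return rows


def overall_availability(up_seconds_per_hour):
    """Return availability % across the whole reporting period."""
    total = SECONDS_PER_HOUR * len(up_seconds_per_hour)
    return round(100.0 * sum(up_seconds_per_hour) / total, 2)


# Hypothetical day: 22 fully available hours, then two hours with outages.
sample_day = [3600] * 22 + [3300, 1800]

below_threshold = [hour for hour, _, ok in hourly_availability(sample_day) if not ok]
print("Overall availability:", overall_availability(sample_day))
print("Hours below threshold:", below_threshold)
```

In this made-up day the overall figure stays above 95% even though two individual hours fall short, which is the kind of distinction an interval breakdown is meant to expose.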
The following is an example of a Server Uptime report:
DataStore Health Check (UTS-499, UTS-516)
To improve robustness, up.time now performs regular checks to verify that the database is running. This health check is configured through the new datastoreHealthCheck.checkInterval and datastoreHealthCheck.timeLimit parameters in uptime.conf. If the database link is determined to be down, up.time will pause monitoring and email administrators to investigate.
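To confirm which health check values are in effect, the two parameters can be read out of the configuration text. The Python sketch below assumes uptime.conf uses simple key=value lines with # comments; the sample values (60 and 30) are placeholders invented for the example, and these notes do not state the parameters' units or defaults.

```python
def read_conf(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def health_check_settings(text):
    """Return the two DataStore health check parameters (None if unset)."""
    conf = read_conf(text)
    return {
        name: conf.get("datastoreHealthCheck." + name)
        for name in ("checkInterval", "timeLimit")
    }


# Hypothetical excerpt; the values are placeholders, not documented defaults.
sample = """
# uptime.conf (excerpt)
datastoreHealthCheck.checkInterval=60
datastoreHealthCheck.timeLimit=30
"""

settings = health_check_settings(sample)
print(settings)
```

A value of None for either key means the parameter is not set explicitly in the file.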
Changes to Existing Features
The following existing features have changed for the current release.
Dashboard Interaction
With the introduction of custom dashboards, the Global Scan panel has been reorganized and renamed Dashboards. Previous Global Scan views (e.g., SLAs, Network, All Services) are now dashboards that are accessed by tab, and the Dashboards panel is now home for all out-of-box and custom dashboards.
On the standard dashboards, you will notice a number of enhancements.
Updated Charting Engine
We have been updating the charting engine behind our diagnostic and reporting tools.
The Quick Snapshot pages, which summarize an Element's performance over the last 24 hours, have all been unified in terms of look and feel. Depending on the Element type, you may find the summary has received a fresh coat of paint:
Quick Snapshots are also now fully interactive, allowing you to zero in on a smaller time slice, and print or export the result:
Additionally, the gauges and dials on the Resource Scan and Network dashboards have also been updated:
Global Scan Charts: Recent Incidents and Current Service Status
The Global Scan status charts have been moved to the top of the dashboard and now refresh in real-time. The old Recent Outages bar chart is now called the Recent Incidents chart, as it reports all non-OK service monitor statuses, including host re-checks.
Column Sorting
Column sorting has been added to the Global Scan, Resource Scan, Network, and All Elements dashboards. Settings are retained on a per-user-session basis.
Performance Improvements
In addition to under-the-hood optimizations that improve overall dashboard performance, Elements on the All Elements dashboard and service monitors on the All Services dashboard are now hidden by default if their status is OK, which helps performance in larger environments. This behavior can be toggled by clearing the respective Hide OK Elements and Hide OK Service Monitors check boxes.
Service Monitor Selection in Alert and Action Profiles
We are continually optimizing configuration paths, and improving the Web interface. Service monitors and Alert/Action Profiles can now be assigned to each other from either configuration screen. Previously, you could associate an Alert Profile or Action Profile with a service monitor only while configuring the service monitor; the Alert Profile and Action Profile configuration screens could not assign service monitors. Now you can associate them at either configuration point.
When configuring an Alert Profile or Action Profile, selecting service monitors for assignment uses the more streamlined drag-and-drop interface:
Notification group assignment when configuring an Alert Profile also now uses this updated interface. This interface allows you to enter a search term to narrow the displayed list, and use Ctrl-clicking and Shift-clicking to expedite service monitor selection.
up.time Installers
The up.time installers for both the Windows and Linux platforms have been rewritten. Although improvements are mainly under the hood, the up.time Controller is now integrated with the core up.time installation, simplifying deployment. During installation, all users will need to provide a passphrase to help generate the Controller's self-signed SSL certificate, so that it can be used immediately to extend up.time's capabilities.
Any v7.1 users who installed the up.time Controller from the separate installer will first need to manually uninstall it before upgrading to v7.2.
Uninstalling the up.time Controller on Windows:
- Open the Programs and Features Control Panel
- Select the up.time Controller.
- Click Uninstall.
- Follow the on-screen prompts.
Uninstalling the up.time Controller on Linux:
- Navigate to the <uptimeInstallDirectory>/controller/uninstaller directory.
  The default up.time install directory is /usr/local/uptime/.
- Run the uninstaller.sh shell script.
ESX-Only VMware vCenter Monitoring
When users are adding a VMware vCenter server to up.time, they can now opt out of collecting data for VMs in the vCenter inventory, and synchronize only the ESX host and cluster inventory. Upgrading up.time users who are already monitoring a VMware vCenter, but similarly do not want to monitor at the VM level, can turn VM data collection off.
Go to the VMware vCenter Element's General Information page, and edit the vSync settings. Deselect the Collect Virtual Machine data check box, then Save. All existing VMware vCenter VMs in your up.time inventory will be deleted as a background task.
Disabling VM data collection will delete all previously imported VMs from your up.time inventory, as well as their historical data and configuration information. Note that this includes VMs that you have detached from the up.time VMware vCenter inventory for standalone monitoring (as outlined in Detaching Ignored VMs).
Logging
up.time's logs have been reorganized and optimized to streamline troubleshooting:
- uptime.log is cleaner, and entries have context IDs that help you correlate with other logs
- there is a new uptime_diagnostics.log, which includes the uptime.log details as a subset, but has added information for deeper investigation
- the output of uptime_console.log and uptime_exceptions.log has been optimized
- the thirdparty.log has more information about the library related to the error
- logging.conf (or whichever file has been defined in the loggingControlFile in uptime.conf) is now included in generated problem reports
Implemented Feature Requests
UTS-661 | The Apache server now uses the mod_deflate module by default; HTTP compression is now enabled. |
UT-14940 | The contents of the Application dashboard can now be sorted alphabetically instead of just by current status. |
UT-14812 UT-11648 UT-10232 | The Topology Tree gadget can be used to hierarchically view topological dependencies, and identify the parent of a dependency that is experiencing an outage. |
UT-14731 | Alerts about an application's status now have more details about which component services are failing (i.e., which master and member monitors are in a CRIT and WARN state). |
UT-14695 UT-11872 | When larger scrollable dashboards automatically refresh, the user's scroll position is no longer reset to the top of the dashboard. |
UT-14684 UT-10397 UT-10119 UT-10082 | Multiple service monitors can be assigned when creating an alert or action profile; multiple alert or action profiles can be assigned to a service monitor. See Service Monitor Selection in Alert and Action Profiles in these Release Notes for more information. |
UT-14642 | Support for Windows 2012 has been added. See Platform Support and Integration Changes in 7.2 in these Release Notes for more information. |
UT-14384 | There is now sorting on all dashboard tables. See Column Sorting in these Release Notes for more information. |
UT-10351 | The Server Uptime report can be used to break down relevant outages by host. |
UT-10270 | The Server Uptime report can be used to give a daily report of the top-10 critical servers. |
UT-10226 | GUI changes have been made to accommodate dashboards that need to load large amounts of data. See Performance Improvements in these Release Notes for more information. |
UT-10081 | The Server Uptime report can be used to generate an availability report for a specific group of servers. |
Other Changes to Existing Features
UTS-1309 | Deleted VMware objects that are not automatically deleted from up.time by vSync will now have "(deleted)" appended to their display name. |
UTS-1220 | Improved reuse of VMware vCenter connections, reducing continuous log-in and log-out messages in the vCenter logs. |
UTS-1157 | There have been significant performance improvements in managing large service groups. |
UTS-880 | When creating or editing a user profile, you can make a custom dashboard their login page. |
UTS-688 | Since a host check is integral to status reporting for an Element, you are now prevented from unmonitoring it (by editing the service monitor and disabling the Monitored check box). You must first change the host check for the Element (e.g. from up.time Agent to ping monitor) on the Services > Host Check page, or disable monitoring for the entire Element. |
UTS-680 | When you acknowledge an alert spawned by an External Check monitor, the status of the service monitor will be reset to OK. |
UTS-627 | Support for VMware vCenter 5.1 and ESXi 5.1 has been made official. |
UTS-516 | When a connection to the DataStore is lost, the Data Collector will now attempt to reestablish a connection for five minutes before exiting gracefully. |
UTS-515 | Problem reports can now be generated from the command line. |
Platform Support and Integration Changes in 7.2
Visit uptime software’s Knowledge Base for the latest comprehensive listing of currently supported monitoring station, database, and agent platforms. The following summarizes platform support changes for up.time since the previous release.
Monitoring Station
Solaris 10 on sparc64 | |
Windows Server 2008 64-bit | |
Windows Server 2012 on x64 |
Monitoring Station Browser
Due to the rapid release cycle of Chrome and Firefox, the latest version of up.time is fully supported on the latest browser versions available at the time release testing began.
Chrome 18 | |
Chrome 21 | |
Chrome 25 | |
Firefox 11 | |
Firefox 14 | |
Firefox 19 | |
Internet Explorer 8 | |
Internet Explorer 10 |
Monitoring Station DataStore
SQL Server 2008 | |
Oracle 11g |
Agent-Based Monitoring
Windows Server 2003 Standard, Enterprise | |
Windows Server 2012 Foundation, Essentials, Standard |
Agentless Monitoring
IBM pSeries HMC V6R1.3 | |
IBM pSeries HMC V7R3.1.0–3.5.0 | |
VMware ESX and ESXi 3.5, Update 1–5 | |
VMware ESX and ESXi 4.0, Update 1 | |
Windows XP Professional SP3 | |
Windows Server 2012 |
Service Monitors
Exchange 2007 | |
IIS 6 / Server 2003 | |
MySQL 5.0 | |
Oracle 9i | |
Oracle 10g | |
SQL Server 2005 | |
SQL Server 2012 | |
WebSphere 7 | |
WebSphere 8.5 | |
WebLogic 12c |
Platform Integration
There are no changes for this release.
Upgrade Notices
Before upgrading, it is important to review and act (if applicable) on the following notices.
VMware vCenter Server Elements
If you have added VMware vCenter Server elements to up.time 6.0 or later, you must contact uptime software support ([email protected]) for additional steps required to prepare for the upgrade process.
Customized Installations
If you are working with a version of up.time that has been customized in any manner beyond the standard installation downloaded from the uptime software Web site, contact uptime software support before performing an upgrade. Some customization steps include the following:
- custom Java heap settings
- verbose logging
- adding -Djava.security.egd=file:///dev/urandom to command-line invocation
- increasing -XX:MaxPermSize
- fine-tuning garbage collection options such as -XX:+PrintGCDetails, -XX:+PrintTenuringDistribution, -XX:+HeapDumpOnOutOfMemoryError
Platform-Specific Advisories
up.time 7.0 and Windows: If you are using up.time 7.0 on a Windows-based Monitoring Station, you will need to perform various upgrade steps due to up.time's transition to 64-bit components. Refer to the directions outlined in the version 7.1 release notes about Upgrading a Windows Monitoring Station.
up.time 7.0 and Red Hat Enterprise Linux 5.8: By default, up.time 7.0 was not supported on Red Hat Enterprise Linux 5.8 due to up.time's transition to 64-bit components; however, it's possible that you manually added 64-bit libraries to your version-5.8 Monitoring Station to establish compatibility. Despite this, you should not upgrade directly from version 7.0 to 7.2, as you may encounter issues; instead, upgrade to version 7.1 first before upgrading to version 7.2.
MySQL Configuration
The default value for version 5.5 of MySQL's max_connections has changed from 110 to 151. For this up.time release, this configuration change has been reflected in the my.ini configuration for the bundled MySQL server.
If you are using MySQL, and have tuned its configuration such that it does not use the default number of maximum connections, note that the upgrade process will change the max_connections value to 151.
If you have further tuned your database settings (e.g., connectionPoolMaximum in uptime.conf), we recommend you verify all settings are correct after upgrade. The original configuration files are backed up during the upgrade process (e.g., the default Linux directory is /usr/local/uptime/config-backup).
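One way to spot a tuned max_connections value that the upgrade has overwritten is to compare the backed-up my.ini with the upgraded one. The Python sketch below uses the standard configparser module and assumes the setting sits in a [mysqld] section, which is the usual MySQL layout; the two file excerpts and the tuned value of 300 are hypothetical and are embedded as strings so the example is self-contained.

```python
import configparser

UPGRADE_VALUE = 151  # value the upgrade writes, per the notice above


def max_connections(ini_text, default=UPGRADE_VALUE):
    """Return max_connections from the [mysqld] section of my.ini-style text."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # my.ini may hold bare flags such as skip-name-resolve
        strict=False,          # tolerate repeated sections or options
        interpolation=None,    # treat % characters literally
    )
    parser.read_string(ini_text)
    if parser.has_option("mysqld", "max_connections"):
        return parser.getint("mysqld", "max_connections")
    return default


# Hypothetical contents standing in for the backed-up and upgraded my.ini.
backup_ini = """\
[mysqld]
skip-name-resolve
max_connections=300
"""
upgraded_ini = """\
[mysqld]
skip-name-resolve
max_connections=151
"""

before = max_connections(backup_ini)
after = max_connections(upgraded_ini)
if before != after:
    print(f"max_connections changed from {before} to {after}; review after upgrade")
```

In practice the two strings would be read from the backup directory and the live installation; the exact location of my.ini is not stated in these notes.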
Resource Allocation
To simplify deployment and to support Custom Dashboards, the up.time Controller is now integrated with the Monitoring Station installation. Because of this change, all 7.2 installations now include additional services which will increase resource usage. As a starting point, our reference testing with a basic custom dashboard implementation showed an additional 1.5 GB of memory was required. How much more you'll need depends on how extensively you use custom dashboards and the up.time API. We recommend you review the CPU and memory presently allocated to up.time and consider increasing them further if you plan on making heavy use of custom dashboards.
Backing up logging.conf
This release includes a revised logging.conf file that will be installed at the root of the up.time directory. Back up your existing logging.conf file before upgrading in order to retain your logging configuration, then make matching changes to the new logging.conf file.
HMC-Managed pSeries Servers
If you are monitoring pSeries servers that are managed by the Hardware Management Console, we recommend you upgrade your JSch SSH implementation. Contact uptime software support for assistance.
httpContext and localhost
The JavaScript in gadgets will have problems accessing an up.time Web server that has been set as localhost. If the httpContext parameter in your uptime.conf has been set to http://localhost:9999 (your port may be different, or you may be accessing up.time via SSL using the https scheme), you need to change the parameter value from a loopback address to an absolute name to ensure gadgets function correctly.
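A quick way to tell whether an httpContext value points at a loopback address is to parse the URL and inspect its host. The following Python sketch uses only the standard library; the function name and the example host uptime.example.com are hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def uses_loopback(http_context):
    """Return True if the URL's host is localhost or a loopback IP address."""
    host = urlsplit(http_context).hostname
    if host is None:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # an ordinary host name, not an IP literal


for value in (
    "http://localhost:9999",
    "https://127.0.0.1:9999",
    "http://uptime.example.com:9999",  # hypothetical absolute name
):
    print(value, "->", "change needed" if uses_loopback(value) else "ok")
```

Any value flagged by this check would need to be replaced with a name that browsers on other machines can resolve.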
UI Instances
If your up.time deployment includes a UI instance, it needs to be able to access the auxiliary gadget files on the core Monitoring Station. Doing so allows dashboards to be shared and viewed across both instances. Creating this required link is now outlined in the documentation for creating a UI instance for the first time; however, existing UI instance users need to modify their current up.time instances in order to enable gadget sharing:
- Shut down the Data Collector service on the UI instance.
- Upgrade the core Monitoring Station by following the standard upgrade procedure.
- Ensure you are logged in with a domain account, and that this account has access to the <installDirectory>/gadgets directory on the Monitoring Station.
- To accommodate sharing user-created gadgets, on the core Monitoring Station system, make the <installDirectory>/gadgets directory accessible by the UI instance system.
  How you make the /gadgets directory accessible depends on the Monitoring Station platform:
  - On Linux, you can use NFS to share the directory on the core Monitoring Station, then mount it on the UI instance
  - On Windows, you can use the mklink command to create a symbolic link on the UI instance that points to the /gadgets directory on the core Monitoring Station, such as in the following example:
    mklink /D "C:\Program Files\uptime software\uptime\gadgets" "\\host\gadgets"
    You will most likely need to modify sharing and security permissions for the directory.
- Upgrade up.time on the UI instance.
- Restart the Data Collector service on the UI instance.
Installing up.time
On the uptime software Support Portal, you will find various documents and articles that will guide you through a first-time installation or upgrade.
Installing for the First Time
A complete, first-time deployment of up.time and its agents is a straightforward process. Refer to the Installation and Quick-Start Guide for complete instructions on performing a first-time installation.
Upgrading from a Previous Version
You can only upgrade directly to up.time 7.2 if your current installed version is version 7.1 or 7.0.
Users who are running version 6.0 or 6.0.1 must first upgrade to 7.0 before upgrading to 7.2. Users who are running version 5.5 or earlier must upgrade to 6.0 or 6.0.1 as a starting point. (Refer to the uptime software Knowledge Base for specific version upgrade paths.) If you are eligible for a direct upgrade path, you can upgrade using the installer for your Monitoring Station’s operating system. The upgrade process installs new features, and does not modify or delete your existing data.
If your current version is older than the version required for a direct upgrade, refer to http://support.uptimesoftware.com/upgrade.php for information on supported upgrade paths. There, you will also find more detailed installation information, including specific upgrade paths.
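The upgrade rules above can be summarized as a small lookup. This Python sketch encodes only the paths stated in these notes; the function name is hypothetical, versions older than 5.5 may need additional intermediate steps not listed here, and the sketch does not cover the Red Hat Enterprise Linux 5.8 advisory, which calls for an intermediate upgrade to 7.1.

```python
def upgrade_path(current):
    """Return the ordered list of versions to install to reach 7.2."""
    parts = [int(part) for part in current.split(".")]
    version = tuple(parts + [0, 0])[:3]  # pad so "7" compares like "7.0.0"
    if version >= (7, 2):
        return []                        # already at 7.2 or later
    if version >= (7, 0):
        return ["7.2"]                   # 7.0 and 7.1 upgrade directly
    if version >= (6, 0):
        return ["7.0", "7.2"]            # 6.0 and 6.0.1 go through 7.0
    return ["6.0 or 6.0.1", "7.0", "7.2"]  # 5.5 or earlier start at 6.0.x


for installed in ("7.1", "6.0.1", "5.5"):
    print(installed, "->", " -> ".join(upgrade_path(installed)))
```

Treat the result as a starting point and confirm it against the supported upgrade paths in the Knowledge Base.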
Before proceeding, ensure you have reviewed all upgrade advisories.
Resolved Issues in 7.2
UTS-1260 | Memory levels required for reporting have been improved by limiting the usage of the SimpleDateFormat class. |
UTS-1191 UT-14938 | When adding vSphere service groups, adding Elements that have the same UUID on the VMware vCenter will no longer cause database issues. |
UTS-1065 | The VMware vCenter's Platform Performance Gatherer service no longer stops monitoring if one of its child ESX hosts or VMs has encountered a fatal issue. The problem host's status is logged, and data aggregation continues for the remaining inventory. |
UTS-987 | An issue has been fixed with the VM Instance Power State service monitor. When assigned to a service group, the status of all the instances is now successfully reported. |
UTS-972 | The Users tab layout now correctly displays users who belong to many user groups. |
UTS-965 | For Monitoring Stations running on Microsoft SQL Server, an issue preventing VMware vCenter Elements from being taken out of maintenance has been resolved. |
UTS-934 | Duplicate output from the Incident Priority Report PDF has been removed. |
UTS-896 | The uptime.pid file now correctly populates on Linux Monitoring Station installations. |
UTS-895 | An inaccuracy in the documentation about declaring auto-discovery subnet ranges has been corrected. |
UTS-861 | PHP timezone accuracy in the Linux Monitoring Station installation process has been improved. |
UTS-858 UTS-709 | An upgrade deadlock scenario on Microsoft SQL Server, where configuration information has been previously corrupted, has been resolved. |
UTS-761 | ESX server and VM Elements that are monitored as part of a VMware vCenter server's inventory now correctly retain their custom, up.time-specific name when the Sync Host Name option is disabled. Previously, a vSync operation would overwrite the custom name regardless of how the Element was configured. |
UTS-728 | Retry handling on database timeouts has been improved. |
UTS-726 | An issue with auto discovery of HMC-managed pSeries servers has been fixed. |
UTS-720 | Handling of null values in some configuration checkers has been improved. |
UTS-683 | The Windows File Shares (SMB) service monitor can now be configured to use passwords that include spaces. |
UTS-642 UT-14696 UT-14991 | In order to circumvent Inventory Report generation issues, the OS Type for some "Network Device" type Elements in the report will be listed as Unknown if the device does not have full configuration details in the DataStore. This change is only applicable in rare v7.0 upgrade cases where existing, non-SNMP-based, "Node" type Elements were not correctly re-added as "Virtual Node" type Elements, and were automatically upgraded as "Network Device" type Elements instead. |
UTS-443 | An issue has been fixed where the Run Queue Length graph for a server was displaying the incorrect metric (run occupancy length). |
Known Issues
(UTS-608) The generation of a report as XML to Screen may freeze if you are using Internet Explorer. By default, Internet Explorer runs in compatibility-view mode for intranet sites; this view mode interferes with XML-to-screen report generation. To work around this issue, in Internet Explorer, go to Tools > Compatibility View Settings, and deselect the Display intranet sites in Compatibility View check box.
Contacting Support
uptime software delivers responsive customer support that is available to both licensed and demonstration users. uptime software offers user support through the following:
- Documentation
- Knowledge Base articles
- Telephone: +1-416-868-0152
- E-mail: [email protected]
- Web site: http://support.uptimesoftware.com
Contacting uptime software
uptime software inc.
555 Richmond Street West,
PO Box 110
Toronto, Ontario
M5V 3B1
Canada
Main Telephone Line: +1-416-868-0152
Main Fax Line: +1-416-868-4867
Copyright © 2013 uptime software inc.
uptime software inc. considers information included in this documentation to be proprietary. Your use of this information is subject to the terms and conditions of the applicable license agreement.