Lightcast Service Level Agreement
At Lightcast, we use cloud services to create and serve our data through our applications. Production applications are typically hosted on a minimum of two hot server instances with load balancing, automated scaling, and automated server recovery. Where applicable, we use serverless technologies such as AWS Lambda for fast scalability under load, or AWS-managed scaling such as Fargate. Security is managed according to industry best practices, with third-party monitoring services including live threat detection.
Service Level Target
Lightcast commits to 24x7 availability of services and data with an individual service uptime of 99.5% monthly.
By availability, we mean that the service can be reached and used to perform its core functionality. In other words, there are no errors rated Priority 1. We monitor each service’s availability and keep uptime records spanning at least 31 days.
Service Level Measurement
Service availability is internally monitored and measured programmatically. Uptime records of at least the trailing 31 days are kept for internal review.
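Service level availability is calculated from the number of days in the applicable calendar month using the following formula (note that in this example the month has 30 days):
Availability (%) = ((30 x 24 x 60) - minutes of unexcused P1 outage) / (30 x 24 x 60) x 100
For example, if there were 30 minutes of unexcused P1 outage (see Incident Response) on September the 5th, and 15 minutes of unexcused P1 outage on September the 8th, the calculation would be as follows:
((30 x 24 x 60) - (30 + 15)) / (30 x 24 x 60) x 100 = (43,200 - 45) / 43,200 x 100 ≈ 99.90%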
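As a rough programmatic check of this calculation, the Python sketch below assumes that availability is the share of minutes in the calendar month not lost to unexcused P1 outages. The function name `monthly_availability` and the year 2024 are illustrative assumptions; they do not come from this agreement or from Lightcast's monitoring tooling.

```python
import calendar


def monthly_availability(year, month, outage_minutes):
    """Return availability (%) for one calendar month.

    Assumption: availability = (minutes in month - unexcused P1 outage
    minutes) / minutes in month * 100.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    total_minutes = days_in_month * 24 * 60
    downtime_minutes = sum(outage_minutes)
    return (total_minutes - downtime_minutes) / total_minutes * 100


# September has 30 days; outages of 30 and 15 minutes, as in the example.
# The year is arbitrary (an assumption) and only selects the month length.
availability = monthly_availability(2024, 9, [30, 15])
print(f"{availability:.3f}%")   # prints "99.896%"
print(availability >= 99.5)     # prints "True": the 99.5% target is met
```

Under these assumptions, 45 minutes of outage in a 30-day month stays well inside the 99.5% target, which allows roughly 216 minutes of unexcused P1 outage.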
Service level performance is published on the Lightcast Status Page.
Service Level Remedy
In the event that our service uptime drops below 99.5% in a given calendar month and the client makes a request for Service Credit within the following calendar month, the client’s sole remedy and our sole and exclusive liability for the service level failure is as follows:
- A Service Credit equal to ten (10) multiplied by the Average Hourly Fee (defined below) multiplied by the number of hours of downtime for the specific product which occurred in that calendar month in excess of four (4) hours. The “Average Hourly Fee” is equal to all fees paid for the applicable service for a given calendar month divided by the number of hours in the applicable calendar month.
- Service Credits are calculated on a calendar-month basis. The total Service Credit during any calendar month may not exceed one-third (1/3) of the fees paid by the client for the applicable service during the same calendar month.
- Any Service Credit due will be credited to the client’s next invoice after the notice has been made, provided that the client’s account is fully paid up, without any outstanding payment issues or disputes (if received within ten (10) days of the end of the then-current month, the Service Credit will appear on the following invoice). No refunds or cash value will be given for unused Service Credits. Service Credits are non-transferable and may not be applied to any other Lightcast service.
- If Lightcast is in breach of any provision of this Service Level Agreement, the client may terminate the service agreement for cause if the breach has not been cured within two (2) weeks of Lightcast receiving written notice of the same.
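The Service Credit arithmetic described above can be sketched in Python as follows. This is an illustration under one reading of the terms, not a billing implementation: it assumes that only downtime hours beyond the first four count, and that the one-third cap applies to the fees paid for the same service in the same month. The function name and the example fee of 7,200 are hypothetical.

```python
def service_credit(monthly_fees, hours_in_month, downtime_hours):
    """Estimate the Service Credit for one service in one calendar month."""
    # Average Hourly Fee: fees paid for the month / hours in the month.
    average_hourly_fee = monthly_fees / hours_in_month
    # Only downtime in excess of four (4) hours is credited (assumed reading).
    excess_hours = max(0.0, downtime_hours - 4)
    credit = 10 * average_hourly_fee * excess_hours
    # The total credit may not exceed one-third of the month's fees.
    cap = monthly_fees / 3
    return min(credit, cap)


# Hypothetical client paying 7,200 for a 30-day month (720 hours).
print(service_credit(7200, 720, 6))    # prints 200.0 (2 excess hours)
print(service_credit(7200, 720, 40))   # prints 2400.0 (capped at one-third)
```

The sketch leaves out the procedural conditions: the client must request the credit within the following calendar month, and the account must be fully paid up.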
Functionality Categories
Each Lightcast service delivers different value to its users, and therefore each one has a specific set of core functions. Impairment of those functions can constitute a high priority outage. Lightcast identifies a list of core functions for each of its services, attached at the end of this document, which will be used in conjunction with the criticality matrix to determine Lightcast’s response to an incident.
Lightcast’s uptime guarantee only applies to an application’s core functionality.
Incident Response
Lightcast categorizes incidents according to their severity (how degraded the affected service is due to the incident) and their impact (how many users are affected by the incident). Based on the severity and impact, Lightcast assigns a priority rating (e.g., P1 for Priority 1) to each incident according to the following matrix:
| | Low Impact | Medium Impact | High Impact |
|---|---|---|---|
| Critical | P1 | P1 | P1 |
| High | P2 | P2 | P1 |
| Medium | P4 | P3 | P2 |
| Low | P4 | P4 | P3 |
The priority rating defines the required SLA response and remediation times. Response is defined as the initial acknowledgement of the issue while remediation refers to the completed restoration of the service and resolution of the issue.
Definitions of Criticalities:
- Critical: Issue impacts core functionality or renders the service inaccessible. Data loss or major degradation.
- High: Product as a whole is still accessible, but core features are degraded. A severe problem affecting the customer experience and material features of the service.
- Medium: Customers are still able to access and use the service’s material and core features. Issues only affect certain features of the service or data. A relatively minor problem that affects customer experience without substantially degrading service functionality.
- Low: A minor inconvenience to customers, workaround available. Little-to-no performance degradation. Typically falls within the margin of error for service/data accessibility. Formatting and/or displaying problems that don’t degrade usability.
Definitions of Impact:
- High Impact: More than twenty-five (25) clients affected.
- Medium Impact: Five (5) to twenty-five (25) clients affected.
- Low Impact: Fewer than five (5) clients affected.
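To make the matrix and the impact thresholds concrete, the following Python sketch looks up a priority from a criticality level and a count of affected clients. The names `PRIORITY_MATRIX`, `impact_level`, and `priority` are illustrative assumptions, not part of any Lightcast tooling.

```python
# Rows are criticality levels, columns are impact levels, as in the matrix.
PRIORITY_MATRIX = {
    "critical": {"low": "P1", "medium": "P1", "high": "P1"},
    "high":     {"low": "P2", "medium": "P2", "high": "P1"},
    "medium":   {"low": "P4", "medium": "P3", "high": "P2"},
    "low":      {"low": "P4", "medium": "P4", "high": "P3"},
}


def impact_level(clients_affected):
    """Map a count of affected clients to an impact level."""
    if clients_affected > 25:       # more than twenty-five clients
        return "high"
    if clients_affected >= 5:       # five to twenty-five clients
        return "medium"
    return "low"                    # fewer than five clients


def priority(criticality, clients_affected):
    """Return the priority rating for an incident."""
    return PRIORITY_MATRIX[criticality][impact_level(clients_affected)]


print(priority("high", 30))     # prints "P1"
print(priority("medium", 5))    # prints "P3"
```

Note that any incident of Critical severity is rated P1 regardless of how many clients are affected.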
Definitions of Priorities and required Time To Response:
- Priority 1 (P1): Incidents that demand immediate attention and resolution. We will post all known incidents on our status page within one (1) hour and will provide regular updates via email and our status page.
Response time SLA: 1 hour
Remediation time SLA: 3.5 hours
- Priority 2 (P2): Incidents that need prompt attention. We will post incidents on our status page within four (4) hours and will provide regular updates via email and our status page.
Response time SLA: 4 hours
Remediation time SLA: 1 day
- Priority 3 (P3): Incidents that require attention from the engineering team and are prioritized over normal work. We will post all known incidents on our status page within one (1) business day. Updates will be provided upon request and upon the resolution of the issue.
Response time SLA: 1 business day
Remediation time SLA: as needed
- Priority 4 (P4): Incidents requiring normal prioritization. We update customers upon request.
Response time SLA: 1 business week
Remediation time SLA: as needed
Note: As defined above, response times define the beginning of Lightcast’s response to the incident. This does not define the timeframe by which we will be required to deliver a remediation to the incident. Remediation time refers to the time by which Lightcast commits to have the issue resolved.
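As an illustration of how the P1 and P2 targets translate into deadlines, the Python sketch below adds the response and remediation times to the moment an incident is reported. It rests on two assumptions that this document does not state: that the clock starts when the incident is reported, and that "1 day" means 24 hours. P3 and P4 are left out because their targets are expressed in business days or "as needed". All names are hypothetical.

```python
from datetime import datetime, timedelta

# Wall-clock targets for P1 and P2 only (assumption: "1 day" = 24 hours).
SLA_TARGETS = {
    "P1": {"response": timedelta(hours=1), "remediation": timedelta(hours=3.5)},
    "P2": {"response": timedelta(hours=4), "remediation": timedelta(days=1)},
}


def sla_deadlines(priority, reported_at):
    """Return response and remediation deadlines for a P1 or P2 incident."""
    targets = SLA_TARGETS[priority]
    return {name: reported_at + delta for name, delta in targets.items()}


# Hypothetical P1 incident reported at 10:00.
deadlines = sla_deadlines("P1", datetime(2024, 9, 5, 10, 0))
print(deadlines["response"])      # prints "2024-09-05 11:00:00"
print(deadlines["remediation"])   # prints "2024-09-05 13:30:00"
```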
Exclusions
The following are excluded from SLA calculations of service availability:
- Scheduled maintenance: any non-emergency server maintenance performed on a quarterly basis during the hours of 9pm - 5am Pacific Time, with five (5) business days’ advance notice posted to subscribers to the Status Page.
- Upstream service provider outages (AWS, DNS, Snowflake etc.)
- Issues related to third-party domain name system (DNS) errors or failures.
- Security incidents and events.
- Verified bugs of any third-party software used by Lightcast in conjunction with Hosted Services.
- Force majeure events, including without limitation natural disasters, governmental or societal actions, and unexpected infrastructure failures.
- Client environment issues affecting connectivity or interfering with Lightcast services, including without limitation, client’s connection to the internet (e.g., problems with a client’s ISP or modem), or any other client software or equipment, client firewall software, hardware or security settings, client's configuration of software (e.g. antivirus or anti-spyware software), or operator error of the client.
Rate Limiting
Lightcast’s services may be subject to a rate limit to ensure that individual users of the services do not create an unreasonable load that threatens the availability of the services and continuity of service for all users. The following products and rate limits are subject to change:
- Skills API: 5 requests per second;
- Titles API: 5 requests per second;
- Companies API: 5 requests per second;
- JPA API: 10 requests per second;
- Profiles API: 10 requests per second;
- Core LMI API: 300 requests per 5 minutes.
Service Rate limits are subject to change.
To determine if a service you are using is subject to such a rate limit, or to ensure the rate limit is up to date, please review the relevant service documentation.
If more requests are needed, contact your Lightcast representative.
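Clients who want to stay within these limits can throttle requests on their own side. The Python sketch below is one possible approach, a sliding-window throttle. It assumes nothing about how Lightcast enforces the limits on the server side, and the class name and its parameters are hypothetical rather than part of any Lightcast SDK.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Client-side throttle: at most max_requests per window_seconds."""

    def __init__(self, max_requests, window_seconds,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock      # injectable so the throttle can be tested
        self._sleep = sleep
        self._sent = deque()     # timestamps of recent requests, oldest first

    def wait(self):
        """Block until one more request fits inside the window."""
        now = self._clock()
        # Forget requests that have already left the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.max_requests:
            # Sleep until the oldest remaining request leaves the window.
            self._sleep(self.window_seconds - (now - self._sent[0]))
            now = self._clock()
            self._sent.popleft()
        self._sent.append(now)


# Limits quoted above; they are subject to change.
skills_throttle = SlidingWindowThrottle(max_requests=5, window_seconds=1.0)
core_lmi_throttle = SlidingWindowThrottle(max_requests=300, window_seconds=300.0)
```

Calling `skills_throttle.wait()` immediately before each Skills API request would keep a single client process at or below five requests per second. The limit itself should still be confirmed against the relevant service documentation.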
Snowflake Credit Limiting
If a client’s Snowflake access is via a reader account (i.e., because the client does not maintain their own Snowflake subscription), their Direct Database Access via Snowflake is subject to a monthly limit of 200 credits.
The Snowflake credit limit is subject to change.
More credits may be purchased through your Lightcast representative.
Support
Email support: 24 x 7 at customersupport@lightcast.io (note that for 12 hours on Sunday, EST, this email is unmonitored). Errors and outages will be evaluated and responded to based on the Incident Response section of this policy.
Our support teams will respond to emails, questions, and outages as needed. Engineering teams will respond to any software defects or outages, and are assisted by a number of automated tools and notification systems to respond effectively.
Monitoring
Lightcast’s public status page provides historical records of incidents and outages to the public. Clients may opt into receiving alerts for particular services from our status page service via email or Slack.
Lightcast monitors each service for availability and the usability of its core functionality. The service’s availability (also known as uptime) will be checked by automated tests at least hourly. Logs of this uptime checking are available to clients upon request. Our monitoring and logging infrastructure is the source of truth for determining service availability, errors, and compliance with our uptime commitments.
Additionally, Lightcast monitors services for errors and impacts to core functionality using a combination of in-house and third-party tools, supported by a regularly scheduled on-call rotation for services. Logs are retained as dictated by Lightcast’s security commitments.
Business Continuity Planning
A recovery time objective, or RTO, is the amount of time required to recover from a major incident, including without limitation a complete service failure. A recovery point objective, or RPO, is the amount of acceptable data loss measured in time – time between the latest available backup and the incident (in other words, a 30 minute RPO means that in the event of a failure at 04:37, by the end of the RTO the service’s data will be in the state it was at 04:07 or later).
RTO and RPO commitments are subject to the exclusions listed above.
In support of our SLA, Lightcast performs business continuity planning and drills these plans at least twice each year. These plans are designed and tested/drilled to ensure the following RTO and RPO:
RTO: 3.5 hours
RPO: 30 minutes
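For illustration, the RTO and RPO figures can be turned into concrete timestamps for a given failure time. The Python sketch below reuses the 04:37 failure from the example above. The date and the function name `recovery_bounds` are assumptions made only for this example.

```python
from datetime import datetime, timedelta

RTO = timedelta(hours=3.5)     # recovery time objective
RPO = timedelta(minutes=30)    # recovery point objective


def recovery_bounds(failure_at):
    """Return the two timestamps implied by the RTO and RPO."""
    return {
        # Restored data must reflect this moment or later.
        "oldest_acceptable_data_state": failure_at - RPO,
        # The service should be recovered by this moment.
        "service_restored_by": failure_at + RTO,
    }


# Failure at 04:37; the date is arbitrary.
bounds = recovery_bounds(datetime(2024, 9, 5, 4, 37))
print(bounds["oldest_acceptable_data_state"])   # prints "2024-09-05 04:07:00"
print(bounds["service_restored_by"])            # prints "2024-09-05 08:07:00"
```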