What HMRC Accepts
The Spring Budget 2023 extended the scope of qualifying expenditure to include data licences and cloud computing services. The change took effect for accounting periods beginning on or after 1 April 2023 and was enacted in Finance (No. 2) Act 2023. The underlying manual treatment sits within CIRD83000.
Two categories of cost are in scope:
- Cloud computing services. Payments for cloud-based compute, storage, and related infrastructure services used in the R&D activity.
- Data licences. Payments for access to datasets used in the R&D activity, subject to certain conditions on the licence terms and use rights.
Both categories follow the usual apportionment rule: only the portion used in qualifying R&D is claimable, with the balance (production, administration, customer-facing hosting) outside the scope.
When the Rules Apply
The rules apply to expenditure incurred in accounting periods beginning on or after 1 April 2023. A company with an accounting period from 1 January to 31 December 2023 does not qualify for cloud cost relief because the period began before the effective date; its cloud cost first becomes claimable in the 1 January to 31 December 2024 period.
The position for a company with a period straddling 1 April 2023 (for example, 1 October 2022 to 30 September 2023) needs more care. Because that period began before the effective date, the pre-Finance (No. 2) Act 2023 rules apply to the whole of it, and under those rules cloud and data costs were not claimable at all. The post-April 2023 position applies only from the first period beginning on or after 1 April 2023, which for this example is the 1 October 2023 to 30 September 2024 period.
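The start-date test described above can be sketched in a few lines. This is a minimal illustration, not a compliance tool; the function name `cloud_costs_in_scope` is hypothetical, and the only input that matters is the first day of the accounting period.

```python
from datetime import date

# Effective date stated in the text: periods beginning on or after 1 April 2023.
EFFECTIVE_DATE = date(2023, 4, 1)


def cloud_costs_in_scope(period_start: date) -> bool:
    """Return True if data and cloud costs are in scope for the whole period.

    Only the period start date is tested; the period is either wholly in
    scope or wholly out of scope.
    """
    return period_start >= EFFECTIVE_DATE


# The three periods discussed in the text.
print(cloud_costs_in_scope(date(2023, 1, 1)))   # False: began before 1 April 2023
print(cloud_costs_in_scope(date(2022, 10, 1)))  # False: straddling period
print(cloud_costs_in_scope(date(2023, 10, 1)))  # True: first qualifying period
```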
Apportionment
Cloud provider invoices rarely identify R&D workloads directly. Four common apportionment methods:
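Tag-based. Cloud resources are tagged as "R&D" or "production" via the cloud provider's tagging feature. Cost allocation reports then break the monthly bill down accordingly. This is usually the strongest method, because the split is taken from the provider's own billing data rather than from an estimate.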
Resource-group-based. Separate resource groups, subscriptions, or projects for R&D versus production. The billing is already split by the cloud provider structure.
Workload-based. Estimation of compute hours attributable to R&D workloads (model training runs, experimental jobs) versus production. Evidence via training job logs, batch job histories, or MLOps platform data.
Headcount-based. Where cloud access is distributed across a team, apportion by the team's R&D percentage. Less precise but sometimes the only viable method for small organisations.
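As an illustration of the tag-based method, the sketch below sums a simplified billing export by tag and derives the R&D share of total spend. The data layout, tag values, and function names are assumptions made for the example; real provider exports carry many more columns and need their own parsing.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical billing lines: (tag value, cost in GBP).
billing_lines = [
    ("rd", Decimal("1200.00")),
    ("production", Decimal("900.00")),
    ("rd", Decimal("300.00")),
    ("admin", Decimal("100.00")),
]


def spend_by_tag(lines):
    """Sum cost per tag value."""
    totals = defaultdict(Decimal)  # Decimal() is zero
    for tag, cost in lines:
        totals[tag] += cost
    return dict(totals)


def rd_share(lines, rd_tag="rd"):
    """Fraction of total spend carrying the R&D tag (0 if there is no spend)."""
    totals = spend_by_tag(lines)
    total = sum(totals.values(), Decimal("0"))
    if total == 0:
        return Decimal("0")
    return totals.get(rd_tag, Decimal("0")) / total


print(spend_by_tag(billing_lines))
print(rd_share(billing_lines))  # 0.6 for the sample lines above
```

In this sketch anything not tagged "rd" counts as non-R&D, which is the cautious default when tagging is incomplete.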
What Is Included
- Cloud compute hours for R&D workloads (model training, experimental batch runs, compilation farms).
- Cloud storage for datasets, code repositories, and artefacts used in R&D.
- Serverless function execution for R&D pipelines.
- Container orchestration resources supporting R&D staging environments.
- Cloud-based development and testing platforms where they support qualifying activity.
- Data licences for training datasets, reference datasets, and data used in scientific research.
What Is NOT Included
- Cloud infrastructure hosting the live customer-facing application.
- Cloud-based commercial SaaS (CRM, HR, accounting).
- Data feeds used for commercial operations rather than R&D.
- Cloud storage for historic archives with no ongoing R&D use.
- Bandwidth and egress attributable to customer traffic.
- Cloud costs incurred in accounting periods beginning before 1 April 2023.
Common Enquiry Risks
- Claiming the entire cloud provider invoice. Apportionment is nearly always needed.
- Including production hosting. Customer-facing workloads are outside the scope.
- Weak apportionment evidence. Flat percentages without cost-allocation reports are hard to defend.
- Including periods before 1 April 2023. The effective date rule is strict.
- Double-counting against the software category. Cloud subscriptions can fall under either heading, so each cost should be claimed under one heading only.
- Including data licences not used in qualifying R&D. Operational market feeds and commercial BI data are out.
Worked Example (Indicative)
A machine-learning start-up with a 1 April 2025 to 31 March 2026 accounting period incurs total AWS cost of £240,000 across compute, storage, and networking. Resource tagging shows that 55% of the spend supported R&D model training and experimentation, 40% supported the live customer-facing product, and 5% supported internal administrative workloads.
In addition, the company spent £18,000 on a commercial dataset licence used to train its core model.
Claimable cloud cost: £240,000 x 55% = £132,000. Claimable data cost: £18,000 (licence wholly used in R&D training). Total data and cloud contribution: £150,000.
At the 20% merged scheme credit rate, this contributes £30,000 gross credit; net benefit approximately £22,500 after corporation tax at 25%. Under ERIS at 27%, gross credit would be £40,500. Figures are indicative.
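The arithmetic in the worked example can be reproduced as follows. The rates and amounts are the indicative figures from the text; the variable names are illustrative only.

```python
from decimal import Decimal

# Inputs from the worked example.
cloud_total = Decimal("240000")
rd_share = Decimal("0.55")
data_licence = Decimal("18000")

claimable_cloud = cloud_total * rd_share        # 132,000
qualifying = claimable_cloud + data_licence     # 150,000

# Merged scheme: 20% credit, taxed at the 25% corporation tax rate.
merged_rate = Decimal("0.20")
ct_rate = Decimal("0.25")
gross_credit = qualifying * merged_rate         # 30,000
net_benefit = gross_credit * (1 - ct_rate)      # 22,500

# ERIS at the 27% rate quoted in the text.
eris_rate = Decimal("0.27")
eris_credit = qualifying * eris_rate            # 40,500

print(claimable_cloud, qualifying, gross_credit, net_benefit, eris_credit)
```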
Navigating the Major Cloud Providers
The major cloud providers each offer tools to isolate R&D workloads from production and administrative workloads. A quick tour of what works.
AWS. Tag resources with an "rd-project" key and use Cost Allocation Reports plus AWS Cost Explorer to break the monthly bill by tag. Alternatively, use separate AWS accounts within an Organization for R&D and production, giving mechanical billing separation.
Microsoft Azure. Resource groups, subscriptions, and tags provide three layers of separation. Azure Cost Management reports aggregate spend by any of these dimensions. The enterprise enrollment model allows dedicated R&D departments within a shared enrollment.
Google Cloud Platform. Projects provide strong separation. Billing is naturally aggregated by project. Labels on individual resources give a finer-grained view where projects are shared.
The choice of separation strategy depends on organisational structure, but the goal is the same: be able to point HMRC to a clean report that isolates the R&D workloads from non-R&D workloads.
Data Licences: What Qualifies
Data licences are claimable where the data is used directly in qualifying R&D activity and the licence terms permit that use. Practical examples include:
- Training datasets for machine-learning model development (image corpora, text corpora, annotated datasets).
- Reference datasets used in scientific research (chemical compound databases, genomic reference sequences, materials property databases).
- Commercial datasets licensed for R&D purposes (market research data used for algorithm development, geospatial data used in novel analytics).
- Open datasets where there is a recurring access or infrastructure cost.
Data used purely for operational purposes (customer analytics dashboards, live trading feeds, business intelligence reporting) does not qualify. The licence terms should not restrict R&D use; some data licences exclude research use, in which case the cost is outside the rules regardless of how it was actually used.
MLOps and the Training-Inference Boundary
For machine-learning companies, the cloud cost split often tracks the training-inference boundary. Training runs (experimental and production model training) are typically qualifying R&D activity. Inference (serving predictions to customers) is typically production activity.
The boundary is not always sharp. Shadow inference comparing new and current model versions may be R&D. Production inference with logged outputs used for future training may have an R&D portion. A specialist adviser will map the MLOps pipeline against the R&D qualifying-activity test and set defensible apportionment percentages.
Documentation for Data and Cloud Claims
- Cloud provider invoices for the claim period.
- Cost allocation or billing reports showing the split by tag, resource group, project, or account.
- Written apportionment methodology explaining the split between R&D and non-R&D.
- For data licences: the licence agreement, evidence of use in R&D activity, and confirmation that the terms permit R&D use.
- Reconciliation between cumulative cloud provider invoices and the claimed cost.
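A reconciliation of this kind can be sketched as a simple check that the claimed figure ties back to the R&D-tagged spend and never exceeds the invoices. The monthly figures and the `reconcile` helper below are hypothetical and stand in for data taken from invoices and cost allocation reports.

```python
from decimal import Decimal

# Hypothetical monthly figures: (invoice total, R&D-tagged spend per the allocation report).
months = {
    "2025-04": (Decimal("20000"), Decimal("11000")),
    "2025-05": (Decimal("21000"), Decimal("11500")),
}
claimed = Decimal("22500")


def reconcile(months, claimed):
    """Compare the claimed cost with invoiced and R&D-tagged totals."""
    invoiced = sum((inv for inv, _ in months.values()), Decimal("0"))
    rd_tagged = sum((rd for _, rd in months.values()), Decimal("0"))
    return {
        "invoiced": invoiced,
        "rd_tagged": rd_tagged,
        # A non-zero difference needs explaining in the methodology note.
        "difference": claimed - rd_tagged,
        # R&D spend should never exceed the invoice it comes from.
        "within_invoices": all(rd <= inv for inv, rd in months.values()),
    }


print(reconcile(months, claimed))
```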
Bandwidth, Egress, and Network Costs
Cloud bills often include significant bandwidth and egress charges. The claim treatment mirrors the underlying workload: egress attributable to R&D workloads (shifting training data, distributing experimental results, replicating datasets between regions for R&D purposes) is claimable; egress attributable to customer traffic (serving the live application) is not.
Tag-based cost allocation usually captures network costs alongside compute costs. Where the cloud provider does not attribute network costs to specific resources, a defensible allocation key (R&D share of compute hours, or R&D share of storage) can be used as a proxy.
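Where a proxy key is needed, the allocation is a single proportion. The sketch below uses the R&D share of compute hours as the key; the function name and the sample figures are assumptions for illustration, and the same shape works for a storage-based key.

```python
from decimal import Decimal


def allocate_network_cost(network_cost, rd_compute_hours, total_compute_hours):
    """Allocate unattributed network cost by the R&D share of compute hours."""
    if total_compute_hours <= 0:
        raise ValueError("total_compute_hours must be positive")
    if not 0 <= rd_compute_hours <= total_compute_hours:
        raise ValueError("rd_compute_hours must be between 0 and the total")
    share = Decimal(rd_compute_hours) / Decimal(total_compute_hours)
    # Round to pence.
    return (network_cost * share).quantize(Decimal("0.01"))


# Hypothetical month: 600 of 1,000 compute hours were R&D.
print(allocate_network_cost(Decimal("1000"), 600, 1000))  # 600.00
```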
Pre-Paid and Reserved Instances
Pre-paid cloud commitments (AWS Reserved Instances, Azure Reserved VM Instances, GCP Committed Use Discounts) are claimed in the period the underlying services are consumed, not the period the commitment was paid. This matches the normal accruals treatment.
A three-year commitment paid upfront would be amortised over the three years for accounting and for the R&D claim. The R&D apportionment for each period reflects the R&D usage of the reserved capacity in that period.
Where reserved instances have been mis-sized (purchased for R&D workloads but underutilised), the claim is still limited to actual R&D consumption. Unused capacity does not extend the claim.
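The treatment of an upfront commitment can be sketched as follows. The example assumes straight-line amortisation over equal periods and measures R&D use as hours consumed against hours reserved, so unused capacity adds nothing to the claim. Both are simplifying assumptions, and the function name and figures are hypothetical.

```python
from decimal import Decimal


def reserved_claim_by_period(upfront_cost, rd_hours_used, reserved_hours):
    """Claimable amount per period for an upfront reserved commitment.

    upfront_cost: total paid upfront (Decimal).
    rd_hours_used: R&D hours actually consumed in each period.
    reserved_hours: hours of capacity reserved per period.
    Assumes straight-line amortisation over len(rd_hours_used) equal periods.
    """
    per_period = upfront_cost / len(rd_hours_used)
    claims = []
    for used in rd_hours_used:
        if not 0 <= used <= reserved_hours:
            raise ValueError("R&D hours must be between 0 and reserved hours")
        share = Decimal(used) / Decimal(reserved_hours)
        claims.append((per_period * share).quantize(Decimal("0.01")))
    return claims


# Hypothetical three-year commitment of 36,000 for one instance (8,760 hours a year).
# Year 1 fully used for R&D, year 2 half used, year 3 not used for R&D at all.
print(reserved_claim_by_period(Decimal("36000"), [8760, 4380, 0], 8760))
```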
Serverless and Managed Services
Serverless functions (AWS Lambda, Azure Functions, Google Cloud Functions) and managed services (managed databases, managed Kubernetes, managed messaging) are claimable where they support R&D workloads. The apportionment follows the same R&D-versus-production split as traditional compute.
A development or staging environment used for R&D testing is typically fully claimable. A production database that also holds R&D data (for example, production analytics feeding R&D experiments) is apportioned based on the R&D use share.
Straddling the Effective Date
Companies with accounting periods straddling 1 April 2023 need to be careful. The rules apply only to accounting periods beginning on or after 1 April 2023. A 1 October 2022 to 30 September 2023 period does not qualify for cloud cost relief at all; a 1 October 2023 to 30 September 2024 period does, for all the cloud cost in the period.
There is no pro-rating of cloud cost relief within a single accounting period. Either the whole period qualifies (if it began on or after 1 April 2023) or none of it does (if it began before).
Post-April 2024 Interactions
The April 2024 merged scheme changes did not directly alter the data and cloud cost rules. Both remain claimable on the same basis as under the old SME scheme and RDEC. However, the heightened compliance focus across all R&D claims applies equally to cloud cost apportionment. Claims with weak evidence for the R&D-versus-production split face the same scrutiny as weak staffing or subcontractor positions.
For context on the wider merged-scheme compliance environment, see the merged scheme guide.
Multi-Cloud and Hybrid Cloud Workloads
R&D workloads increasingly span multiple cloud providers or hybrid on-premise and cloud environments. The claim treatment tracks the workload irrespective of where it runs.
An R&D workload that uses AWS for training, Azure for specialist analytics, and on-premise GPUs for inference experimentation generates claim cost across three billing streams. Each is apportioned based on its R&D share and aggregated into the claim. Where on-premise GPUs are involved, the electricity cost sits under consumables rather than data and cloud.
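Aggregating several billing streams might look like the sketch below, which applies each stream's R&D share and totals the result by cost heading. The streams, costs, shares, and heading labels are invented for illustration; on-premise power is routed to consumables, as described above.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical billing streams for one claim period.
streams = [
    {"source": "aws", "heading": "data_and_cloud",
     "cost": Decimal("100000"), "rd_share": Decimal("0.60")},
    {"source": "azure", "heading": "data_and_cloud",
     "cost": Decimal("40000"), "rd_share": Decimal("0.25")},
    {"source": "on_prem_gpu_power", "heading": "consumables",
     "cost": Decimal("8000"), "rd_share": Decimal("0.50")},
]


def claim_by_heading(streams):
    """Apply each stream's R&D share and total the result per cost heading."""
    totals = defaultdict(Decimal)
    for s in streams:
        totals[s["heading"]] += s["cost"] * s["rd_share"]
    return dict(totals)


print(claim_by_heading(streams))
```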
Sandbox Environments and Free Tiers
Cloud free tiers and sandbox credits used for R&D experimentation do not generate claimable cost, because the company incurs no expenditure. Paid usage beyond the free tier is claimable on the usual apportionment basis.
Innovate UK grant-funded cloud credits may have their own treatment depending on the grant terms and whether the underlying cost is borne by the claimant. A specialist adviser will review the position where cloud costs are funded by a grant.
Data Annotation and Curation Services
Machine-learning R&D often depends on annotated datasets. Where the annotation or curation work is paid for as a service, the treatment depends on the relationship with the provider.
A data annotation firm engaged on a fixed-price deliverable basis is usually a subcontractor. The cost is claimable under the subcontractor heading at 65% (unconnected parties) with the UK provision rule applying post-April 2024.
A per-worker annotation team supplied under the claimant's direction is an externally provided worker (EPW) arrangement and follows the EPW rules.
A data product (access to an annotated dataset under licence) is a data licence and falls under the data and cloud heading.
The routing matters: the same underlying work can attract different rules depending on how it is contracted.
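The routing described above can be written down as a simple lookup. This is a sketch of the three cases in the text only; the enum, the heading labels, and the helper names are hypothetical, and real contracts often need case-by-case review.

```python
from decimal import Decimal
from enum import Enum


class Arrangement(Enum):
    FIXED_PRICE_DELIVERABLE = "fixed-price deliverable"
    WORKERS_UNDER_DIRECTION = "workers under the claimant's direction"
    LICENSED_DATASET = "licensed annotated dataset"


# Cost heading for each contracting model, as set out in the text.
HEADING = {
    Arrangement.FIXED_PRICE_DELIVERABLE: "subcontractor",
    Arrangement.WORKERS_UNDER_DIRECTION: "EPW",
    Arrangement.LICENSED_DATASET: "data and cloud",
}

# 65% restriction for unconnected subcontractors, per the text.
UNCONNECTED_SUBCONTRACTOR_RATE = Decimal("0.65")


def route(arrangement: Arrangement) -> str:
    """Return the cost heading for a given contracting model."""
    return HEADING[arrangement]


def unconnected_subcontractor_claim(cost: Decimal) -> Decimal:
    """Claimable amount for a payment to an unconnected subcontractor."""
    return cost * UNCONNECTED_SUBCONTRACTOR_RATE


print(route(Arrangement.LICENSED_DATASET))
print(unconnected_subcontractor_claim(Decimal("10000")))
```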
Co-location, Dedicated Hardware, and Private Cloud
Not all R&D compute is public cloud. Some R&D runs on co-located dedicated hardware or on a private cloud. The cost treatment depends on the ownership model.
Owned hardware. Capital cost, outside the R&D tax relief rules (but potentially eligible for Research and Development Allowances, or RDAs). Power to run it: consumables. Software licences for the OS and virtualisation: software.
Co-location (rack space rented from a provider). Rental fee for the rack space: typically rent and outside the rules. Power consumed: consumables (if metered) or excluded. Internet connectivity: depends on purpose.
Private cloud on rented hardware. Where hardware is rented on a compute-hour basis similar to public cloud, the cost sits under data and cloud. Where hardware is leased on a fixed monthly basis, the treatment is closer to rent and may be excluded.
Specialist advice is important for hybrid setups.
Related Cost Categories
- Software — for licence-based tools including cloud-delivered SaaS.
- Staffing costs — for engineers operating the cloud workloads.
- Eligible Expenses pillar guide.