Finetuning Transformers

Bottom Line: : Fine-tuned Transformer models offer councils the ability to customise AI for highly specialised tasks, enabling greater precision in areas like document processing, automated inspections, and public consultation analysis.

By training models on domain-specific data, councils can enhance efficiency, improve service delivery, and reduce administrative burdens—whether through automating social housing applications, expediting planning approvals, or monitoring road conditions.

However, fine-tuning requires significant upfront investment in data preparation, annotation, and infrastructure, which can be a barrier for councils without existing datasets. While the impact potential is high, the feasibility of implementation depends on data availability and technical expertise.

In many cases, collaborating on shared models or leveraging external providers may be a more practical approach. Nonetheless, where applied effectively, fine-tuned Transformers can deliver transformative improvements across social housing, planning, transport, and public health services.

Technical Maturity

Data Requirements

Scalability

Skill Requirements

Regulatory & Ethics

B+

Social Housing
‍

Planning
‍

Transport
‍

Public Health & Social Care

Name: Finneas

Bio: Finneas is like a master craftsman who has spent years training in a specific field. While others might have broad general knowledge, Finneas has been taught to specialise in a particular area. Whether it's legal advice, or form-filling Finneas has been fine-tuned to understand and respond with exceptional accuracy in that domain. They may not search for new information like Ragnar, but his deep expertise makes him incredibly reliable in his specialty.

How they work

The input-to-output mapping for this particular technique can be broken down into 3 separate components that form a pipeline for input processing.

Pre-Trained Models: Fine-tuning begins with an out-of-the-box, pre-trained LLM, which has been trained on a broad corpus of general data.‍
Specialised Dataset Creation: A task-specific dataset is curated, containing domain-relevant input-output pairs that act as examples for the model to follow. This data is annotated and pre-processed to ensure quality and alignment with the desired outcomes.‍
Fine-Tuning Process: The LLM is trained further on this specialised dataset using supervised learning. During this phase, the model learns to prioritise domain-specific patterns, phrases, and contexts. This can also be known as the ‘training’ phase, a process which is shared with other more standard deep learning models, such as neural networks.‍
Validation and Testing: The fine-tuned model is evaluated on a validation set to ensure it generalises well to new inputs. Adjustments may be made to the dataset or training parameters based on performance.
‍Deployment: The fine-tuned model is integrated into the intended workflow, often with additional layers for monitoring and post-processing to ensure robust, reliable outputs

‍

Key Capabilities

Domain Expertise: Fine-tuned models can incorporate highly specific knowledge and terminology, making them invaluable for niche applications such as legal, planning, or housing services.

Improved Accuracy: By tailoring the model to specific tasks, fine-tuning significantly improves accuracy and relevance compared to general-purpose LLMs.

Token Savings: By finetuning the model, there is no longer a need to include examples in the prompt, meaning that the number of input tokens are less and there the running costs of operating the model will be reduced.

Key Limitations

Data Requirements: High-quality, annotated datasets are essential for fine-tuning. The absence of such data can limit the model’s effectiveness.

Computational Resources: Fine-tuning requires significant computational power and expertise, which may present a barrier for smaller organisations.

Cost and Maintenance: The fine-tuning process involves upfront costs and ongoing maintenance to ensure the model remains up-to-date with evolving requirements.

Feasibility Analysis

Technical Maturity

Fine-tuning Transformer models is a well-established technique in AI, with substantial adoption across multiple domains, including NLP, computer vision, and audio analysis. It is widely used in industry and academia, particularly for tasks requiring high precision and domain-specific expertise. Models like BERT, GPT, Vision Transformers, and Wav2Vec have been extensively fine-tuned for various applications, demonstrating their robustness. Because of this, there are wide ranging resources to access in order to aid the development of the model (take this video for example that may be useful as an introduction for data teams)

Data Requirements

Fine-tuning Transformers requires high-quality, domain-specific datasets for effective training. This can include structured, unstructured, or multimodal data depending on the use case (e.g., policy documents for text, annotated planning maps for images, or meeting recordings for audio).

The data must be well-curated, annotated, and prepared, which can involve substantial effort in cleaning and pre-processing. For many councils, while some relevant datasets may already exist, further work will often be needed to ensure the data meets the requirements of fine-tuning.

The necessity for large volumes of data is less pronounced since Transformers will already have a general understanding of most tasks which they acquired in their pre-training process, nevertheless performance still scales with the quality and quantity of available data.

Scalability

Fine-tuned Transformer models can scale effectively in many cases, particularly when deployed on cloud infrastructure optimised for machine learning. However, updating fine-tuned models to accommodate new data or tasks requires re-training or adaptation, potentially disrupting scalability, becoming a more and more resource intensive process as the number of instances of the model grows.

Although fine-tuning itself is resource-intensive, the resulting models are typically efficient in deployment and can handle increased workloads with appropriate infrastructure, much like with prompted LLMs and RAG models.

Skill Requirements

Implementing and maintaining fine-tuned Transformer models traditionally required considerable technical expertise across multiple areas. This included a solid understanding of machine learning workflows and Transformer architectures, as well as experience in preparing datasets for fine-tuning and selecting appropriate hyperparameters.

With the advent of fine-tuning platforms, such as the one provided by OpenAI, the process has been simplified to an easy-to-understand interface with a lower barrier to entry.Once the model is live however, teams must possess the skills to monitor model performance and address challenges such as overfitting or unintended biases. While pre-trained models and open-source tools can streamline some aspects of the process, most councils would need to invest in specialised training for their IT and data teams or depend on external experts for fine-tuning and optimisation.

Regulatory & Ethical Considerations

Fine-tuning enables models to be customised for domain-specific tasks, enhancing fairness, transparency, and alignment with ethical considerations. However, it also introduces certain risks. The use of domain-specific data must comply with GDPR and other regulations, particularly when dealing with sensitive information.

Additionally, biases present in training datasets can propagate into fine-tuned models, making robust governance frameworks essential for monitoring and mitigation, but since the nature of the data the model is trained on is now known, this process becomes possible, whereas in RAG and there is no access to the data the model is trained on.

While fine-tuned models can offer greater transparency compared to general-purpose ones, they will still require clear documentation and thorough validation to meet audit standards. With proper oversight and regular audits, these challenges can be effectively managed.

Impact Analysis

Automated Property Inspections

Vision models can streamline property inspections by detecting issues like mould or structural damage from tenant-submitted images.

They prioritise urgent cases, reduce manual inspections, and support preventative maintenance through pattern detection.

High development costs make this viable only if councils already have large, labelled image datasets.

Automated Housing Application Assessment

Transformers can extract and verify key data from housing applications, cutting processing times from weeks to minutes.

They reduce backlogs, free staff for complex cases, and ensure faster, fairer outcomes—with humans reviewing uncertain cases.

Upfront investment is needed for integration and accuracy, but long-term savings on manual processing are significant.

Opportunities

‍

Automated Property Inspections (Vision Models)

Vision transformers, which process images, offer some potential for social housing teams, particularly in the identification and resolution of property defects. By training models on large datasets of annotated property images—such as examples of mould, structural cracks, water damage, or pest infestations—these tools can automate the visual inspection process, identifying issues with remarkable accuracy. For instance, tenants can upload photos of suspected problems through an app, which the model analyses to detect and categorise defects, assess their severity, and recommend appropriate actions. This pre-inspection capability streamlines workflows by prioritising properties requiring urgent attention, reducing the need for initial manual inspections. Housing teams can address minor issues remotely or escalate more severe problems to specialist contractors, significantly speeding up response times and improving tenant satisfaction.

For scheduled property inspections, image-based models act as an advanced triage system, enhancing inspection efficiency by pre-screening properties based on tenant-submitted images. This pre-check capability allows inspectors to arrive with a prioritised checklist of high-risk or high-priority areas, ensuring their time is used efficiently. Additionally, by identifying patterns of recurring issues or systemic problems within specific buildings or regions, these models provide a data-driven approach to preventative maintenance, reducing the frequency and severity of future defects.

The costs for developing such a model can be extremely high, where existing labeled datasets are not available, which will be the case for the large majority of councils. The majority of the costs will be concentrated in the data-curation part of the process, as this is a highly specialised task. This is thus not a task that is recommended, unless the social housing team has an existing database it can leverage. #

‍

Automated Assessment of Housing Application Documents

Transformers can also be leveraged to automate the processing of social housing applications and supporting documents. Instead of staff manually reviewing paper forms, pay stubs, IDs, and proof of eligibility, a fine-tuned transformer model can read and extract key information from these files. For example, the Columbus Housing Authority in the US sought an AI solution to classify documents, extract data, and check for completeness, automatically approving or rejecting submissions that are missing information (see route-fifty.com). In practice, a vision-language transformer might scan an application form to capture answers (income, household size, etc.), verify that required proofs (like ID or residency documents) are attached, and flag any omissions or inconsistencies for human follow-up. The model can be trained on a sample of past applications and their outcomes to recognise patterns in complex housing needs assessments.

Automating application intake speeds up a traditionally paperwork-heavy process. Applications that are complete and clearly eligible can be processed in minutes rather than weeks, reducing backlogs and wait times for applicants. Staff time spent on data entry and cross-checking is greatly reduced, allowing officers to focus on interviewing high-need cases or making final decisions. Overall efficiency gains can be substantial – a housing authority can off-load routine processing tasks to AI and accelerate voucher/application processing for residents. This use case leaves the decision making process down to the existing team, meaning that risks of bias, which would be extremely problematic, especially with something as crucial to wellbeing as housing, are omitted from the system.

Ultimately, this means tenants get decisions (or placed in housing) faster, and allows housing teams to focus more resources on the decision making process, enabling a fairer, and more op. Importantly, human case workers should remain in the loop to review cases with inconsistencies or with low-certainty to ensure errors are avoided, and that there are clear safeguarding protocols against potential mistakes.

Adapting these models to housing applications will require fine-tuning on local form templates and terminology, which means gathering a few hundred sample documents for training and validation. However, provided the housing team has maintained records of previous planning applications, the documents and the corresponding outputs, the data curation process will be relatively inexpensive. The development effort involves integration with the housing team’s IT systems (document management, CRM) and setting up confidence thresholds so that low-certainty cases fall back to human review.

It is highly likely that the running costs of hosting the fine-tuned LLM will be offset and indeed exceeded by the savings from manual processing hours. The main cost burdens however will come from upfront capital costs, such as ensuring data security, especially with handling of sensitive personal information, as well as, ensuring maximum accuracy, by curating as much training data as possible, and curating a strict evaluation criteria and regimen prior to model release.

Analysis of planning applications

Fine-tuned LLMs and vision models can automate validation steps in planning applications, detecting missing or non-compliant information.

This reduces manual checks, speeds up approvals, and frees staff to focus on complex evaluations.

Shared development across councils or central support is recommended to maximise data, scale, and ROI.

Public consultation analysis

LLMs can summarise and categorise public feedback, turning unstructured input into clear, actionable insights.

Real-time summaries and dashboards improve transparency, responsiveness, and public trust in planning decisions.

With minimal data prep, councils can deploy cost-effective tools that save time and boost engagement.4o

Opportunities

‍

Analysis of planning applications

Much like with social housing applications, fine-tuned transformers can significantly improve the efficiency of processing planning applications by automating key validation steps. For example, fine-tuned LLMs can be leveraged to identify missing information or flag descriptions that contradict local planning regulations. Meanwhile, a vision transformer (ViT) or object detection model can analyse technical drawings—such as site plans, floorplans, and elevation drawings—to check for essential elements like scale bars, north arrows, or required plan types. The Alan Turing Institute has already demonstrated a prototype that uses AI to classify and detect floorplans in planning applications, automatically spotting common errors that often lead to rejection. By catching these issues early, AI can either alert applicants to fix their submissions before they are reviewed by officers or assist staff by highlighting potential problems, reducing the time spent on manual checks.

The impact of this automation on service delivery would be substantial. Currently, each planning application takes between 30 to 60 minutes of manual review, adding up to roughly 250,000 hours of validation work per year across the UK. An AI-driven system could dramatically cut this down by instantly flagging missing details or common errors—many of which account for up to 80% of application rejections (see ATI’s case study). This would reduce backlogs, speed up approvals, and provide applicants with quicker feedback instead of waiting weeks. Additionally, AI frees up planning officers to focus on more complex tasks, such as evaluating the design and policy implications of applications, rather than spending time hunting for missing documents. With fewer applications delayed due to errors, councils could also expect a decrease in public inquiries about application statuses.

Implementing AI validation for planning applications is both feasible and cost-effective, provided the right resources are in place, and that this work is done at the right scale. One key advantage is that local councils already have large archives of past applications, which can be used to train AI models without needing to create new datasets. However, data preparation is necessary, as past applications must be labeled to teach the AI what constitutes a valid or incomplete submission, and given the complexity of the task the more data that is available the implementation-ready the performance of the model will be. For this reason, and given the fact that this issue is one that affects councils collectively, as well as the broader economic benefits of such developments mean that we believe that the implementation of such a system should be considered at a higher level than individual councils. This could be achieved through cross-council collaboration via a consortium or by central government leadership, supported by funding streams such as Innovate UK grants.

Initial trials could involve running the AI alongside human reviewers to test accuracy before full integration. Over time, AI could be embedded directly into online portals, providing real-time feedback to applicants before submission. Although there are upfront costs related to development, IT integration, and ongoing maintenance, the return on investment is strong—reducing staff workload, cutting validation times, and preventing costly delays in the planning process.

Public consultation analysis

LLM-powered analysis of public consultations can transform the way councils process and respond to community feedback, making engagement more efficient, transparent, and data-driven. Public consultations often generate a vast amount of unstructured data in the form of written comments, survey responses, and spoken input from public meetings. Traditionally, analysing this feedback is a slow, manual process where planners must read through hundreds or even thousands of submissions, summarise them in a structured format to extract common themes and concerns. By leveraging the vast amount of historic data, councils can first fine-tuned LLMs to summarise consultations into their usual structure and format, as well as to categorise responses across key predetermined themes—such as traffic concerns, environmental objections, or support for affordable housing—allowing planners to quickly understand public sentiment and focus on the most critical issues. Sentiment analysis can further refine insights by determining whether feedback is generally supportive, neutral, or opposed to a proposal. This technology ensures that decision-makers receive structured, digestible insights shortly after consultations close, significantly reducing the lag between public input and policy adjustments.

Many residents do not engage in planning consultations because they perceive the process as opaque or feel their input will not make a difference. A well-designed online platform, where residents can submit feedback in free text or structured forms, allows AI to process responses in real time, grouping similar comments and generating live summaries. This immediate visibility reassures the public that their voices are being heard and prevents concerns from disappearing into bureaucratic backlogs.

Furthermore, all this analysis can then feed into a live dashboard, enabling councils to respond more dynamically to emerging issues. For example, if early feedback highlights overwhelming opposition to a specific development aspect, planners can adjust their communication strategy, issue clarifications, or modify proposals before formal objections escalate. Publishing AI-generated summaries of consultations in an interactive format can enhance trust and can encourage greater engagement, provided that feedback is personalised, and can offer glimpses into the impact it is having on decision makers. The strength of this communication back to the public will be key to adoption, and a great deal of care must be taken to design this feedback mechanism.

There is a clear ROI pathway for this use case, saving costs by significantly reducing processing time for consultations, as well as the improvement of the public’s experience with the process, which may also lead to higher levels of engagement. With the vast amount of past consultation, summarisation data councils are likely to hold, the data curation process will be relatively frictionless, and will enable the outputs of the model to match the nuances of the desired outputs of each individual council. For categorisation, it is less likely that historic consultations will be labelled with key themes, and even if this is not so, it is a one-time exercise that will bring lasting benefits to the planning process.. Integration into existing digital consultation platforms will bring additional costs, though this can be done incrementally—starting with AI-assisted analysis as a standalone tool and later embedding it into real-time engagement portals. Ultimately, the savings in staff time, improved responsiveness, and enhanced public trust make fine-tuned LLM-driven consultation analysis a cost-effective and scalable solution for planning teams.

‍

Automated Road Inspections

Vision models turn routine driving into continuous road monitoring, detecting defects like potholes in real time.

They reduce emergency repairs and enable predictive maintenance, improving efficiency and cutting long-term costs.

High setup costs and infrastructure demands may be mitigated by procuring services from specialised providers.

Intelligent Traffic Monitoring

AI vision models track cyclists and pedestrians, providing real-time data for safer, more targeted infrastructure planning.

Near-miss detection supports proactive interventions and evidence-based investment in cycling and walking facilities.

Using existing cameras keeps costs low, though full deployment may require external partners and strong public communication.

Opportunities

‍

Automated road inspections

Road inspection using vision transformers can significantly enhance the way local authorities monitor and maintain road infrastructure. Traditionally, road inspections are carried out manually by surveyors walking or driving along roads to identify defects such as potholes, cracks, faded road markings, and damaged signs. However, vision transformer models can now automate this process by analysing images and video collected from vehicles equipped with cameras. Councils can mount cameras on refuse collection lorries, inspection vehicles, or even staff cars, allowing AI to assess road surfaces in real time as these vehicles go about their normal routes. For example, Surrey County Council has begun using dashboard-mounted cameras paired with AI to automatically detect and log potholes, missing signs, and overgrown vegetation. The AI model, trained on thousands of road images, flags potential defects and records their GPS location, creating a digital record that can be fed into an asset management system. Over time, this approach enables predictive maintenance—by comparing recent images with historical data, the system can anticipate when cracks may develop into potholes, ensuring repairs are carried out before damage worsens. Effectively, this transforms routine driving into a continuous road inspection process, reducing the reliance on labour-intensive surveys and allowing for more proactive maintenance strategies.

By automating defect detection, councils can ensure that problems are caught earlier, preventing small cracks from turning into costly potholes. This results in fewer emergency repairs, which are typically more expensive and disruptive than planned maintenance. AI-generated road condition ratings also enable councils to prioritise repairs more effectively, directing resources to the most deteriorated or at-risk areas first. Cities that have adopted AI-based road assessments have reported major cost savings—one US city saved over $80,000 by switching to AI monitoring , which allowed funds to be redirected towards preventative repairs.

Many councils already have vehicles in service that can be fitted with low-cost cameras, including smartphones, rather than requiring expensive specialist equipment. Some councils, such as Surrey, have opted to buy AI road monitoring as a service from a commercial partner rather than build their own system, but initial trials can use zero-shot classification models, such as Google's SigLIP model, which can then be fine-tuned using council-specific data. While the long-term savings from reducing the need for manual inspections, reducing compensation claims for potholes, and reducing repair costs, there are a number of costs to consider - from hardware, to data curation and tagging, to cloud storage of image data, to model training and evaluation. This introduces a lot of uncertainty and scope for spiralling costs in implementing such a solution, so it may be preferable to procure this as a service from existing companies, with definitive costs, enabling more definitive risk management.

Furthermore, most local authorities already use geographic information systems (GIS) for infrastructure management, meaning AI-generated road condition data can be seamlessly integrated into existing workflows. As more data is collected, the AI models improve in accuracy, learning the characteristics of local roads and further enhancing cost-effectiveness over time. Overall, AI-powered road monitoring offers a practical, affordable, and scalable solution that helps councils maintain roads more efficiently while improving safety and public satisfaction.

‍

Intelligent Traffic Monitoring

AI-driven vision transformers provide councils with real-time data on walking and cycling, enabling smarter infrastructure planning. Using existing traffic cameras or dedicated sensors, these models detect and classify vulnerable road users (VRUs) such as cyclists and pedestrians, continuously tracking movement patterns, speeds, and near-miss incidents. Systems like VivaCity cameras already achieve 95–97% accuracy in road user classification, offering councils a data-rich alternative to manual surveys. In the West Midlands, the council used VivaCity’s services to identify collision risks in real time, allowing authorities to act on near-miss hotspots before serious accidents occurred. This continuous monitoring replaces reactive reporting with objective, high-frequency data, supporting more evidence-based transport planning.

Data-driven insights enable proactive safety improvements and better infrastructure investment. Near-miss data helps councils identify high-risk locations and intervene before accidents occur. Accurate, real-time counts of cyclists and pedestrians also strengthen the case for new cycle lanes and pedestrian crossings, ensuring resources are allocated where demand is highest.

Monitoring can be implemented cost-effectively using existing traffic cameras with software upgrades, as seen in the West Midlands. Leveraging pre-trained models for object detection, the council has the opportunity to develop proof-of-concepts in house, adapting them to a limited set of fine-tuning data at a relatively low cost. However, the full implementation of such a system will require complex cloud and IT infrastructure to feed camera data to the model, and then to process and store labelled data. For this reason, it may be more cost effective to commission companies such as VivaCity when needed, especially if data collection is not needed on a permanent, rolling-basis, but rather on an intermittent basis, when new infrastructure is being considered. Finally, it is worth mentioning that public engagement is key—clear communication ensures residents understand that monitoring is for safety, not surveillance.

‍

General & Personalised Health Advice

Fine-tuned language models can deliver safe, tailored health advice at scale, improving engagement and service efficiency.

They help answer routine queries, boost public health literacy, and support hard-to-reach groups with culturally sensitive messaging.

Development costs vary by complexity; funding partnerships are advised for personalised or multilingual systems.

Transcription & Medical Note Taking

AI transcription can streamline social care documentation, improving accuracy and freeing staff time.

Fine-tuned models structure complex information into professional formats, reducing risks in care decision-making.

Due to high setup costs, councils are advised to wait for mature market solutions rather than build their own.

Opportunities

‍

General & personalised health advice

Finetuned language models can enable public health teams to provide personalised health advice and answer public queries at scale. Where RAG and prompted models couldn’t provide the level of specialisation and accuracy required from an application that rightfully has such a low tolerance to error. Chatbots can be fine-tuned with official guidelines and local service information, ensuring tailored advice for residents, whilst ensuring that advice stays within a safe and low-risk level with users being redirected towards medical services when exceeding a specified level of urgency.

Chatbots can help improve public engagement, health literacy, and service efficiency through interactive, conversational responses, making health information more accessible and user-friendly. This can potentially lift a major load off of health services by dealing with routine queries, leaving the more complicated cases to healthcare professionals. For example, young people who may be reluctant to ask a human about sexual health might freely interact with an anonymous AI assistant to get factual advice, potentially reducing risky behaviours. In mental health, preliminary evidence shows AI “virtual therapists” can help individuals who might never go to an in-person councellor, thus extending support to hard-to-reach populations (see local.gov.uk). On the public health front, personalised reminders and nudges (like tailored diet tips or medication reminders) have been shown to improve adherence to healthy behaviours. At the community level, having multilingual, culturally sensitive AI messaging ensures everyone receives vital information in a way they understand – potentially boosting things like vaccine uptake among diverse groups.

The cost for developing such a system will rise with the complexities of the outputs required from it, for example, if only a restricted set of general advice is needed to be given, then the costs will be manageable, requiring only a moderate size dataset to be curated. However if this is to be extended towards personalised health advice and reminders, then the amount of data required will increase, and the needs for safeguarding outputs, will rise in complexity, and with it costs. For more complex systems, it’s recommended that development costs are captured from funding streams, allowing for cross-sectoral collaboration and alleviating financial risk.

‍

Transcription and medical note taking

AI-driven transcription and case note automation in social care rely on the ability of language models to accurately capture and structure critical domain-specific information from spoken consultations. While general-purpose transcription models can be used with minimal setup through prompting, fine-tuning these models on specialised data will significantly improve their accuracy and reliability in capturing the complex medical and social care information. Fine-tuning will involve training a model on historical case notes, and real-world conversations between care professionals and clients, allowing it to learn the specific terminology, phrasing, and documentation formats used in social care.

A general-purpose transcription model may accurately transcribe speech but struggle to structure information according to the required format of a social work report. With fine-tuning, the model learns to differentiate between sections such as background context, assessment findings, and recommended actions, ensuring that outputs align with professional documentation standards. This level of accuracy is particularly important when summarising medical conditions, legal considerations, and care recommendations, where even minor misinterpretations could lead to misinformed decision-making.

However, the primary challenge of fine-tuning lies in the resource investment required to curate and prepare training data. Unlike prompting, which relies on an existing model with broad general knowledge, fine-tuning demands a dataset of high-quality, structured case notes and transcribed consultations. These datasets must be annotated, anonymised, and validated to ensure they accurately reflect real-world social care documentation. This process requires collaboration between AI specialists and domain experts, as well as compliance with data protection regulations. The upfront costs of data collection, annotation, and model training can be significant, particularly for councils with limited digital infrastructure. Additionally, fine-tuned models may require periodic retraining to incorporate new documentation standards, policy changes, or evolving best practices in care provision. As a result, we recommend that public health and social care teams wait until transcription models become readily available on the market, rather than undertaking development themselves.

‍