How they work
The strengths of this technique can be broken down into three key characteristics:
Versatility: These models can perform various tasks, including text summarisation, translation, content generation, sentiment analysis, and answering questions, depending on the prompt's design.
Zero-shot and Few-shot Learning: General-purpose LLMs can often generate accurate outputs with few or no task-specific examples. Zero-shot learning involves asking the model to perform a task directly from a prompt, while few-shot learning includes providing a few examples within the prompt to guide the response (see the sketch after this list).
Contextual Understanding: LLMs excel in maintaining context over long passages, enabling them to generate coherent and contextually appropriate responses even for complex queries.
Generality: The generality of pre-trained LLMs is in itself a limitation. While they are trained on vast datasets, they may lack deep expertise in niche or highly specialised fields. This can lead to incomplete or incorrect responses: the more specialised the task, the greater the likelihood of error.
Hallucinations and Fabrications: LLMs sometimes generate information that is factually incorrect or entirely fabricated, a phenomenon known as "hallucination". This is particularly concerning in high-stakes applications where accuracy is critical, such as legal, medical, or governmental contexts. An LLM will always respond to a query, but guaranteeing that the response is grounded in truth requires complex and expensive data governance.
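To make the zero-shot/few-shot distinction concrete, the minimal Python sketch below contrasts the two prompt styles for the same classification task. The tenant comments and labels are invented for illustration only.

```python
# Illustrative only: the difference between a zero-shot and a few-shot
# prompt for the same classification task. All example comments are invented.

zero_shot_prompt = (
    "Classify the sentiment of this tenant comment as Positive, Negative "
    "or Neutral:\n"
    "'The repair team arrived on time and fixed the boiler quickly.'"
)

few_shot_prompt = (
    "Classify the sentiment of tenant comments as Positive, Negative or Neutral.\n"
    "\n"
    "Comment: 'Still waiting for a response after three weeks.' -> Negative\n"
    "Comment: 'The new heating system works fine.' -> Positive\n"
    "\n"
    "Comment: 'The repair team arrived on time and fixed the boiler quickly.' ->"
)
```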
As with any dataset—not just those related to housing—LLMs excel at tasks such as text categorisation, summarisation, and data cleaning. These capabilities can unlock valuable insights, particularly when manual data processing isn’t feasible due to volume or complexity (e.g., word matching, punctuation correction).
For example, as part of their Home by Home plan, DG Cities successfully categorised repair descriptions for all properties within the Royal Borough of Greenwich for the past 3 years. This process enabled us to prioritise retrofit efforts and strategically allocate properties to different retrofit schemes by combining repair data with stock condition surveys. By cleaning and processing this historical data, we have been able to unlock a deeper understanding of housing assets, revealing new levels of actionable detail.
Notably, processing 600,000 repair instances incurred a cost of under £5 for LLM usage, with only moderate expenses associated with model development. This highlights the efficiency and cost-effectiveness of leveraging LLMs for large-scale data processing.
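The exact pipeline used in the Home by Home work is not reproduced here; the sketch below shows one plausible shape for LLM-based repair categorisation, assuming the OpenAI Python SDK. The category list, model choice, and sample data are illustrative assumptions, not details from the project. At scale, requests would be batched and results cached, which is how per-record costs stay at fractions of a penny.

```python
# A minimal categorisation sketch, assuming the OpenAI Python SDK.
# The categories, model choice and sample data are hypothetical.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CATEGORIES = ["damp and mould", "heating", "plumbing", "electrical", "other"]

def categorise(description: str) -> str:
    """Map one free-text repair description to a single category."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "You classify housing repair descriptions. Reply with "
                        f"exactly one of: {', '.join(CATEGORIES)}."},
            {"role": "user", "content": description},
        ],
        temperature=0,  # deterministic output suits classification
    )
    return response.choices[0].message.content.strip().lower()

repairs = ["Boiler not firing, radiators cold", "Black mould on bathroom ceiling"]
print([categorise(r) for r in repairs])
```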
Once data is cleaned and processed, the next layer of analysis can begin. There are many mature frameworks for the use of prompted LLMs in text-to-SQL tasks, allowing social housing providers to draw relevant data from large, complex datasets using natural language. This will impact providers both in terms of service delivery and cost savings.
By expanding the accessibility of data, councils can make more data-driven decisions on resource allocation, with text-to-SQL acting as a data retrieval assistant. Furthermore, text-to-SQL models can be combined with more powerful reasoning models, such as OpenAI's o1, to provide data-driven answers to complex and nuanced queries about the council's housing stock; in this case, the impact on service delivery can be transformative. With the barrier to entry for interacting with data essentially eliminated by prompted models, the time-to-answer is significantly reduced, and the need to bring in outside experts to perform analysis is diminished, meaning the cost savings can be significant.
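As a rough illustration of the text-to-SQL pattern, the sketch below assumes the OpenAI Python SDK and a hypothetical SQLite table of repairs; mature frameworks add schema retrieval, validation, and error recovery on top of this basic loop.

```python
# A minimal text-to-SQL sketch, assuming the OpenAI Python SDK and a
# hypothetical SQLite database. Production systems should validate the
# generated SQL (read-only connection, allow-listed tables) before running it.
import sqlite3
from openai import OpenAI

client = OpenAI()
SCHEMA = ("CREATE TABLE repairs (property_id TEXT, category TEXT, "
          "reported_date TEXT, cost REAL);")

def ask(question: str, db_path: str = "housing.db"):
    # Step 1: translate the natural-language question into a SQL query.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Translate the user's question into a single SQLite "
                        f"SELECT statement for this schema:\n{SCHEMA}\n"
                        "Return only the SQL, with no explanation."},
            {"role": "user", "content": question},
        ],
        temperature=0,
    )
    sql = response.choices[0].message.content.strip().strip("`")
    # Step 2: execute the generated query against the database.
    with sqlite3.connect(db_path) as conn:
        return sql, conn.execute(sql).fetchall()

sql, rows = ask("Which five properties had the highest repair costs last year?")
```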
Since the models are used internally, even if more expensive models such as o1 are used to tackle more complex queries, the volume of requests will be far smaller than for a tenant-facing service, and therefore the cost burden of hosting is minimal. The system will require a simple interface for staff to interact with, and will incur initial development costs for the model and pipeline which, given its slightly more complex nature, will be larger than for other more typical prompted models. Nevertheless, with no fine-tuning or training procedures, and no need to research and design model architecture, the relative capital expenditure is quite modest.
These tools can provide immediate responses to routine enquiries, allowing human staff to focus on more complex issues. This gives tenants quicker responses to their queries while reducing the resource load on housing teams. Furthermore, translation, a task at which Large Language Models perform well, especially in more general contexts, can be integrated into the chatbots to ensure communication and support can be provided for tenants from all backgrounds.
Ensuring that responses are robust and adhere to council policies requires more sophisticated and careful design of prompted large language models, with multiple agents, each with its own prompt, to handle different types of requests; upfront development costs can therefore be significant. Even then, chatbots can only be used for a restricted set of possible queries, and are likely to perform poorly on more nuanced ones. Since this would be a service provided to tenants, overall traffic to the model will also increase, raising the running costs of the service. Finally, with limited means to include examples for the model to follow, accuracy and adherence to pre-existing guidelines cannot be guaranteed; in applications where accuracy is key, a prompt-only solution may be unsatisfactory.
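One common way to impose this restriction is a router agent that classifies each query and hands it to a topic-specific agent, declining anything outside the allow-list. The sketch below is a minimal version of this pattern, assuming the OpenAI Python SDK; the topic names and agent prompts are invented.

```python
# A minimal multi-agent routing sketch, assuming the OpenAI Python SDK.
# Topic names and agent prompts are hypothetical.
from openai import OpenAI

client = OpenAI()

AGENT_PROMPTS = {
    "repairs": "You answer only questions about reporting housing repairs, "
               "following council policy. Decline anything else.",
    "rent": "You answer only questions about rent payments and arrears, "
            "following council policy. Decline anything else.",
}

def route(query: str) -> str:
    """Router agent: classify the query, or return 'other' if out of scope."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Classify the tenant query as one of: "
                        f"{', '.join(AGENT_PROMPTS)}, or 'other'. "
                        "Reply with a single word."},
            {"role": "user", "content": query},
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower()

def answer(query: str) -> str:
    topic = route(query)
    if topic not in AGENT_PROMPTS:  # out-of-scope queries go to a human
        return "Please contact the housing team directly about this query."
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": AGENT_PROMPTS[topic]},
                  {"role": "user", "content": query}],
    )
    return response.choices[0].message.content

print(answer("How do I report a broken boiler?"))
```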
Document summarisation using prompted LLMs offers an efficient solution for managing the vast array of legislative, historical, and planning documents—often held in non-digital formats—that underpin a council's vision for a locality. By swiftly condensing lengthy documents into key points, LLMs enable planning teams to access critical information without having to manually review each source. This accelerates decision-making processes and ensures that both overarching strategies and micro-level details are accounted for.
Fine-tuned LLMs for optical character recognition (OCR), capable of digitising non-digital records, are already widely available. These can unlock valuable insights from archives, helping councils track precedents and maintain continuity in planning decisions. Since there is a finite number of documents to process, the summarisation can be done once and stored offline, minimising the upfront and running costs of the model. Although unlikely and suboptimal, if the resulting summaries are concise enough to fit within the LLM's context window, all historic documents could be queried together, furthering the level of support provided. Nevertheless, Retrieval Augmented Generation (RAG), covered as AI Technique 2, is better suited to this task and offers a more robust solution.
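A simple way to summarise documents longer than a model's context window is to summarise chunks and then summarise the summaries. The sketch below, assuming the OpenAI Python SDK, shows this map-reduce shape; the chunk size and prompts are illustrative, and the one-off results can be stored offline as described above.

```python
# A minimal map-reduce summarisation sketch for long planning documents,
# assuming the OpenAI Python SDK. Chunk size and prompts are illustrative.
from openai import OpenAI

client = OpenAI()

def _summarise(text: str, instruction: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": instruction},
                  {"role": "user", "content": text}],
    )
    return response.choices[0].message.content

def summarise_document(full_text: str, chunk_chars: int = 12_000) -> str:
    # Map: summarise each chunk that fits comfortably in the context window.
    chunks = [full_text[i:i + chunk_chars]
              for i in range(0, len(full_text), chunk_chars)]
    partials = [_summarise(c, "Summarise this planning document extract "
                              "as key bullet points.")
                for c in chunks]
    # Reduce: merge the partial summaries into one concise overview.
    return _summarise("\n\n".join(partials),
                      "Combine these partial summaries into one concise summary.")
```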
Prompted LLMs can efficiently categorise and summarise the wide range of ideas, concerns, and questions submitted during large-scale consultations. This enhances councils' ability to gauge public sentiment on planning initiatives, ensuring that decision-making is informed by a comprehensive analysis of resident perspectives. Although categorisation can be somewhat grounded through few-shot learning (providing examples in the prompt), the effect is limited on diverse data that cannot necessarily be captured in a small number of examples. To completely ground responses in truth, the only alternative would be to fine-tune the model on the council's own data, but this is outside the scope of this technique and is instead captured in AI Technique 3.
By making the interpretation of consultation feedback more accessible, councils can identify key themes and emerging issues more quickly and accurately. This significantly reduces the time required to process feedback and lowers the reliance on external analysts, leading to notable cost savings.
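As an illustration of few-shot grounding for consultation analysis, the sketch below embeds a handful of labelled examples in the prompt and tallies the resulting themes. It assumes the OpenAI Python SDK; the themes, examples, and responses are all invented.

```python
# A minimal few-shot theme-tagging sketch for consultation responses,
# assuming the OpenAI Python SDK. Themes and examples are hypothetical.
from collections import Counter
from openai import OpenAI

client = OpenAI()

FEW_SHOT = [
    ("There aren't enough bus routes to the new estate.", "transport"),
    ("The development will overload the local GP surgery.", "services"),
]

def tag_theme(text: str) -> str:
    examples = "\n".join(f"Response: {t}\nTheme: {theme}" for t, theme in FEW_SHOT)
    result = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Assign one theme (transport, housing, services, "
                        f"environment, other) to each response.\n\n{examples}"},
            {"role": "user", "content": f"Response: {text}\nTheme:"},
        ],
        temperature=0,
    )
    return result.choices[0].message.content.strip().lower()

responses = ["More cycle lanes please.", "Green space must be protected."]
print(Counter(tag_theme(r) for r in responses))  # theme frequencies
```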
Since these models are used internally to assist planning teams, the volume of requests is far smaller compared to a public-facing service. Consequently, the cost burden of running more sophisticated models remains minimal. Implementing such a system would require an interface for planners to interact with the analysis and some upfront development costs to establish the model pipeline. However, given the absence of fine-tuning, bespoke training, or model architecture design, the capital expenditure remains relatively modest.
The cross-cutting impacts of the use of prompted LLMs as a data prepping tool apply to Planning teams as well. For example, data from planning applications, land use surveys, or zoning records often come in unstructured formats. With LLMs, this data can be standardised, categorised and structured into tables, providing a cohesive overview of land and development activity across the council’s jurisdiction. Coupled with advanced querying capabilities, LLMs enable planning teams to extract data to identify trends, assess compliance with local plans, and make data-driven decisions efficiently—reducing reliance on external consultants and improving overall planning outcomes.
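To show what structuring free-text records into tables might look like in practice, the sketch below asks the model for machine-readable JSON per record. It assumes the OpenAI Python SDK's JSON response mode; the field names and sample note are hypothetical.

```python
# A minimal structured-extraction sketch, assuming the OpenAI Python SDK's
# JSON response mode. Field names and the sample note are hypothetical.
import json
from openai import OpenAI

client = OpenAI()

def extract_record(free_text: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},  # force machine-readable output
        messages=[
            {"role": "system",
             "content": "Extract fields from this planning application note as "
                        "JSON with keys: address, use_class, decision, "
                        "decision_date."},
            {"role": "user", "content": free_text},
        ],
        temperature=0,
    )
    return json.loads(response.choices[0].message.content)

row = extract_record("12 High St, change of use to cafe, approved 3 March 2024")
print(row)  # one structured row, ready to append to a table
```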
As discussed in Social Housing, prompted LLMs can be powerful data cleaning tools and can transform the usability of existing data. This is particularly pertinent for manually entered data, for example, surveys of road and transport infrastructure. By preparing all of their data in this way, councils can build a clear picture of all transport infrastructure under their management; paired with the data querying system described in the Social Housing analysis, this can support the decision-making process and reduce the need for external consultations.
Sentiment analysis conducted on data collected from feedback portals can help the council understand the pain points in its transport network, much as with Social Housing. The benefits and limitations are the same as those described there.
In health consultations and social care visits, ensuring that patient concerns and problems are accurately recorded can be difficult, and key information from visits often goes missing. Maintaining consistent detail from consultations and visits is integral to optimising and tailoring care to the needs of each individual. Most API providers already offer speech-to-text services, meaning development costs for an MVP are minimal and only running costs need to be considered.
Once transcription is performed, the resulting text can be passed to an LLM agent prompted to take notes from the transcript, and can even be used to fill out report templates. It is again worth noting that prompted LLMs are limited in their accuracy, especially when applied to a specialised downstream task such as health and social care. Fine-tuned speech-to-text models trained on medical scenarios exist; however, since the specific nature of this data cannot be determined in advance, special care will be needed to guide LLMs as much as possible, and there will need to be an evaluation period with humans kept in the loop to assess whether accuracy is sufficient for full integration into existing workflows.
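A minimal version of this pipeline, assuming the OpenAI Python SDK (with Whisper for the speech-to-text step), might look like the sketch below. The file name, note headings, and model choices are illustrative, and consent and secure data handling are assumed to be resolved upstream.

```python
# A minimal transcription-plus-notes sketch, assuming the OpenAI Python SDK.
# File name, model choices and note headings are hypothetical.
from openai import OpenAI

client = OpenAI()

# Step 1: transcribe the recorded visit (consent and secure storage assumed).
with open("visit_recording.mp3", "rb") as audio:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=audio)

# Step 2: a prompted agent drafts structured notes from the transcript.
notes = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system",
         "content": "From this care-visit transcript, draft notes under the "
                    "headings: Concerns raised, Actions agreed, Follow-up "
                    "required. Flag anything uncertain for human review."},
        {"role": "user", "content": transcript.text},
    ],
)
print(notes.choices[0].message.content)
```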
Since this model will be handling private medical information and voice recordings, careful consideration and budget will need to be allocated to understanding the data lifecycle, ensuring GDPR compliance, and storing data securely. Furthermore, since audio files and the resulting text files will likely be quite long, the running costs of such a system may be high, and a detailed cost-benefit analysis will be needed to assess the feasibility of implementation.
Simple and immediate in its potential is the use of LLMs for translation between patients or residents and care providers. For example, during visits, speech-to-text can first record the conversation, and a separate LLM translation agent can then translate the text into the target language, ensuring seamless communication. The entire pipeline, minus perhaps the interface, is available as an out-of-the-box solution from AI providers, so development costs are minimal and implementation can be rapid.
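The translation step itself can be a single prompted call, as in the sketch below (OpenAI Python SDK assumed; the input text would come from a speech-to-text stage like the one above).

```python
# A minimal translation-agent sketch, assuming the OpenAI Python SDK.
from openai import OpenAI

client = OpenAI()

def translate(text: str, target_language: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": f"Translate the user's text into {target_language}. "
                        "Preserve meaning and tone; do not add content."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

print(translate("Your next care visit is on Tuesday morning.", "Polish"))
```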
Unsurprisingly, data cleaning is cross-cutting and applies equally to public and social services. It can help elevate health data and enable more bespoke care. Combined with other social datasets, such as income and fuel poverty, a clearer understanding of the drivers of health outcomes may support the mitigation of inequalities.
Chatbots can provide personalised advice at scale, acting as a first point of contact for patients seeking information about health concerns or social care services. They can support diagnostics by asking relevant follow-up questions and directing individuals to appropriate care pathways.
For remote monitoring, chatbots combined with sentiment analysis can engage patients in regular check-ins, collecting information on health status through carefully guided and curated questions. Sentiment analysis systems can then be built on top of this collected data to flag potential issues based on mood or language patterns. It is critically important that these systems are integrated into existing services rather than used simply to cut costs and reduce human interaction with patients. Careful design can instead enhance and support existing services, adding continuity to patient monitoring between visits and augmenting care by providing a clearer picture of patient health.
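As a sketch of the flagging layer, the example below screens a check-in message and returns a binary escalation decision. It assumes the OpenAI Python SDK; the screening criteria are illustrative, and any flagged case would go to care staff rather than being acted on automatically.

```python
# A minimal sentiment-flagging sketch for remote check-ins, assuming the
# OpenAI Python SDK. The screening criteria are illustrative; flagged cases
# must always be reviewed by care staff.
from openai import OpenAI

client = OpenAI()

def flag_checkin(message: str) -> bool:
    """Return True if a check-in response should be escalated for review."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "You screen patient check-in messages. Reply FLAG if "
                        "the message suggests low mood, pain, confusion or "
                        "risk; otherwise reply OK."},
            {"role": "user", "content": message},
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip().upper() == "FLAG"

print(flag_checkin("I've been feeling very low and haven't eaten much."))
```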
Despite these strengths, the accuracy of such systems is a significant limitation, particularly when dealing with nuanced or sensitive medical information; they may struggle with the precision required for medical-grade applications. As discussed in other impact areas, governance of chatbot outputs can be difficult with purely prompted LLM systems. To address this, chatbot systems must be restricted to answering only certain questions, typically by assigning dedicated agents to specific query types and constraining their outputs through the prompt.
Moreover, governance challenges must be addressed to ensure compliance with data privacy standards such as GDPR. The handling of sensitive patient information necessitates robust data security and clear lifecycle management. These systems must undergo extensive testing and oversight before integration into workflows to guarantee they meet the accuracy, safety, and ethical standards expected of critical care provision. The opportunity for enhanced service delivery through chatbots is huge; however, solely prompted LLMs may not provide sufficient robustness to achieve this potential.