July 2023 Notes

Saturday 01 July 2023

AI & Electricity

There are parallels between AI and electricity. Both were greeted with a mixture of wonder, apprehension, and fear. They transformed multiple industries and societies, yet both remain unevenly distributed. They are enigmatic, used by many but understood by few, and mysterious even to those responsible for their proliferation. One can kill you outright, the other has the potential to destroy lives and possibly societies. AI may become ubiquitous, a fundamental service without which we would struggle to function, a foundation upon which industry, technology and other services will be built.

Electricity is a natural phenomenon, but it requires work, and resources. Electricity's share of energy consumption is about 20%. AI's consumption of electricity (and other resources including water) is small but increasing.

Is AI sustainable, and how should we use it?

Thursday 06 July 2023

On providing supporting evidence to my arguments

Create new components: Train (for train of thought), and Car (for link in the train) to describe context and present arguments.
Human conversations require context (something that may be missing when communicating with e.g. ChatGPT). Compare documentation for LLMs.
Could there be proof for LLMs? And what proof could humans use (cf. philosophical proofs) e.g. for an article.
Potential for 'pure ideas'; ideas that always behave the same way, and can be used and reused by others in new situations.
Reference Thomas Hobbes' Train of Imagination: When a man thinketh on anything whatsoever, his next thought after is not altogether so casual as it seems to be.
When beginning a new piece, provide justification (reason) and context (foundations)

Thomas Hobbes: Chapter III: Of the Consequence or Train of Imagination

CarbonBrief report on the importance of trees

In-depth Q&A: How trees benefit nature, people and the climate

CarbonBrief | Orla Dwyer

Friday 07 July 2023

On Superalignment

Introducing Superalignment

OpenAI, Jan Leike & Ilya Sutskever

On AI training

On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?

PDF | Emily M. Bender, Timnit Gebru, Angelina McMillan-Major & Shmargaret Shmitchell

While the average human is responsible for an estimated 5t per year, the authors trained a Transformer (big) model with neural architecture search and estimated that the training procedure emitted 284t.
Access to NLP research is not equitable (because of cost of training)
The amount of compute used to train the largest deep learning models (for NLP and other applications) has increased 300,000x in 6 years.
Most sampled papers from ACL 2018 (on NLP) claim accuracy improvements alone as primary contributions to the field, and none focused on measures of efficiency as primary contributions.
It may be more appropriate to deploy models with lower energy costs during inference even if their training costs are high. i.e. high training costs can be excusable if they lead to lower usage costs.
It is past time for researchers to prioritize energy efficiency and cost to reduce negative environmental impact and inequitable access to resources — both of which disproportionately affect people who are already in marginalized positions.
Large, uncurated, Internet-based datasets encode the dominant/hegemonic view, which further harms people at the margins.
In the case of US and UK English, [this means that] white supremacist and misogynistic, ageist, etc. views are overrepresented in the training data, not only exceeding their prevalence in the general population but also setting up models trained on these datasets to further amplify biases and harms.
67% of Reddit users in the United States are men, and 64% between ages 18 and 29.13. Similarly, recent surveys of Wikipedians find that only 8.8–15% are women or girls.
Harassment on Twitter is experienced by “a wide range of overlapping groups including domestic abuse victims, sex workers, trans people, queer people, immigrants, medical patients (by their providers), neurodivergent people, and visibly or vocally disabled people.”
Where traditional n-gram LMs can only model relatively local dependencies, predicting each word given the preceding sequence of N words (usually 5 or fewer), the Transformer LMs capture much larger windows and can produce text that is seemingly not only fluent but also coherent even over paragraphs.
Text generated by an LM is not grounded in communicative intent, any model of the world, or any model of the reader’s state of mind.
An LM is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot.
Significant time should be spent on assembling datasets suited for the tasks at hand rather than ingesting massive amounts of data from convenient or easily-scraped Internet sources

A.I. and Stochastic Parrots | FACTUALLY with Emily Bender and Timnit Gebru

Critique of AI 'movement' especially OpenAI and its sellout
Hampered by being 'negative'; the bros control the products and the sci-fi doomsday scenarios
Emily Bender describes text UI as marketing gimmick for ChatGPT; is this true? Language as an interface seems revolutionary (or at the least, a great UI)
Timnit Gebru raises serious questions about thinking persuasive in the AI community that has roots in eugenics
Generally both dismissive of AI - but is this fair? (And is it not assumed that the term AI is often used casually as a catch-all)
Stochastic parrots quoted but they were not consulted on the open letter calling for a six month moratorium on training of AI systems
Frank speech and openness to discussion, uncertainty but clear on power of markets/money to corrupt, they provide welcome relief to the plastic, seamless bros they criticise.
Both critical about Arxiv

‘You can do both’: experts seek ‘good AI’ while attempting to avoid the bad

The Guardian | Hannah Devlin

Mass discrimination, the black box problem, data protection violations, large-scale unemployment and environmental harms - these are the actual existential risks. We need to focus on these issues right now and not get distracted by hypothetical risks. This is a disservice to the people who are already suffering under the impact of AI.
Prof Sandra Wachter | University of Oxford

From the Guardian article

The end of capitalism

Natural capital at risk: the top 100 externalities of business

Saturday 08 July 2023

On AI & Exploitation

The Exploited Labor Behind Artificial Intelligence

Nash Weerasekera for Noema Magazine

Exploitation of workers in the gig economy through low wages and surveillance
Abuse widespread inc. universities
Roles are highly repetitive and without context inc. labelers, delivery drivers and content moderators
Companies e.g. Amazon treat its workers like machines
Turkopticon: interrupting worker invisibility in amazon mechanical turk
Alphabet workers union
Amazon Employees for Climate Justice
How many gig workers are there?

Sunday 09 July 2023

The old switcheroo

Why is it that companies making godlike claims for their tech are unable to show their workings?

For example, it is left to third parties to determine the GHG emissions cost of training and running bots such as ChatGPT.

OpenAI, the creators of ChatGPT, boast of how quickly they release code.

Could this be in part because they have not considered the consequence of their actions; that they have willfully, or carelessly, responded to pressure from competitors rather than considering the impact of releasing code the effect of which is unknown and which cannot be predicted in advance?

The dangers were forseen. On what grounds do they take it upon themselves to ignore the warnings?

Here are people doing OpenAI's work for them.

Energy and Policy Considerations for Deep Learning in NLP (PDF)
Emma Strubell Ananya Ganesh Andrew McCallum (University of Massachusetts Amherst)
Estimating the carbon footprint of Bloom, a 176b parameter language model (PDF)
Alexandra Sasha Luccioni (Hugging Face), Sylvain Viguier (Graphcore), Anne-Laure Ligozat (LISN & ENSIIE)
The Carbon Footprint of ChatGPT
Kasper Groes Albin Ludvigsen (https://towardsdatascience.com/)
Aligning artificial intelligence with climate change mitigation
Lynn H. Kaack et al. PDF | nature climate change
Carbon Emissions and Large Neural Network Training
Patterson et al. PDF

“That’s something that, you know, we can’t really comment on at this time,” said OpenAI’s chief scientist, Ilya Sutskever, when I spoke to members of the GPT-4 team in a video call an hour after the announcement. “It’s pretty competitive out there.”

GPT-4 is bigger and better than ChatGPT—but OpenAI won’t say why

And yet they found the time to enter ChatGPT and GPT4 in the Uniform Bar Exam and show off their impressive scores.

But OpenAI has chosen not to reveal how large GPT-4 is. In a departure from its previous releases, the company is giving away nothing about how GPT-4 was built—not the data, the amount of computing power, or the training techniques. “OpenAI is now a fully closed company with scientific communication akin to press releases for products,” says Wolf.

GPT-4 is bigger and better than ChatGPT—but OpenAI won’t say why

Even Sutskever suggests that going slower with releases might sometimes be preferable: “It would be highly desirable to end up in a world where companies come up with some kind of process that allows for slower releases of models with these completely unprecedented capabilities.”

GPT-4 is bigger and better than ChatGPT—but OpenAI won’t say why

Budgets & net positive effects

Ideally there should be budgets for emissions, water, etc. and sectors (companies and regions) should be responsible for:

Working out the sustainable budget
Providing the means (technical and financial) for accounting
Providing the means (technical and financial) for fining or excluding rule breakers
Dividing the budget fairly and equitably

If they want a share.

In the short term, while budgets are assessed, companies take on the responsibility for showing all their costs and making a case for net gain.

Monday 10 July 2023

Stochastic parrots again

ChatGPT Is Not Intelligent

Emily M. Bender interviewed by Paris Marx for techwontsave

Bender frustrated at being presented as the critic
Presents positive reaction to ChatGPT as falling for the hype with insufficient scepticism
Criticises failure by reporters to question what is being presented; and of relying on the opinion of vested, non-expert opinion
'Pause letter': part of narrative that posits AI as autonomous agents (and that they are somehow accountable, not those who built them)
Many good sources mentioned - and linked to at the end - throughout the interview e.g. AI Incident Database and Algorithmic Justice League
Ridicules idea that humans are stochastic parrots which is fine, but no explanation as to why

Tuesday 11 July 2023

Cost of ML

Carbon Emissions and Large Neural Network Training

Patterson et al. PDF

3 categories of emissions

GHG emissions resulting from computing, caused by both the electricity used for ML computations and the embodied emissions associated with computing hardware. ML models differ drastically in energy they consume and consumption is spread across the model life-cycle - training, development, tuning and inference (use). Standardised reporting across the life-cycle is essential but not practiced.
'Immediate' GHG emissions effects tied to the short-term outcomes of applications of ML
Structural or 'system-level' GHG effects induced by these applications

Vast majority of ML research and development still focuses on improving model accuracy, rather than balancing accuracy and energy usage.

ICT sector currently accounts for ~1.4% global GHG emissions

⅔ operational energy use (Scope 1 & 2)
⅓ materials extraction, manufacturing, transportation and end-of-life phase (Scope 3)
Cloud and hyperscale data centres account for ~.1-.2% global GHG emissions of which ML less than ¼
Energy for training and using ML is growing rapidly but so is efficiency (overall ICT energy rose 6% 2010-2018 with 550% growth in workloads)
Greater efficiency can come at the cost of greater Scope 3 emissions - embodied emissions in computing hardware and data centre construction

Application impacts

Immediate 'positive' application impacts for climate

Via data mining and remote sensing translates raw data into useable data
Tracking deforestation can inform policy
Forecasting crop yields, power production, and transportation demands
Controlling and improving operational efficiency of complex systems can save energy and resources
Improve speed and efficiency of climate modelling

Immediate 'negative' application impacts for climate

Decrease cost of emissions-intensive activities e.g. oil and gas exploration thereby potentially increasing their consumption
The 'Internet of Cows'

Impact of ML is hard to assess due to lack of data (and reporting).

System-level impacts

Impact of many societal ML applications may not be clear. They are hard to quantify but may be of greater significance than immediate application impacts. One example where outcome is hard to determine is the rebound effect e.g. efficient, shared autonomous vehicles may lead to more journeys. And ML technology in this field may lock us into private transport, preventing a move to greater use of public transport.

Regulation is required around ML-driven technologies so that they (or their creators) demonstrate as fully as possible immediate and long term effects.

Roadmap for assessing and forecasting impacts

New reporting standards, more data collection, novel measurement methodologies and benchmarking frameworks, and new approaches for developing forecasts and scenarios
ML merits new methodologies built on existing LCAs
Consider impact in relation to non-ML solutions

Requirements

Better access to information is crucial including fine-grained detail as to cost of training, inference, etc. and percentage use of data centres by ML
Sufficient data to assess a priori cost of switching to ML or introducing new ML-dependent technology
Reviews and reports based on synthesised data and generalised case studies
Ways to study system-level impacts when digital effects let alone ML are often ignored in high-level studies e.g. SSPs

Aligning ML with climate mitigation

ML applications that are beneficial to the climate
Transparency and accountability as to use of ML
Employing climate aware technology for assessing ML uses
Strategies to combat concentration of ML in a few hands, and algorithmic biases
Standards and shift from private to public entities, and enforced interoperability

Thursday 13 July 2023

AI & Ethics

Constitutional AI: Harmlessness from AI Feedback (PDF)

Yuntao Bai et al. Anthropic

As AI systems become more capable, we would like to enlist their help to supervise other AIs. We experiment with methods for training a harmless AI assistant through self improvement, without any human labels identifying harmful outputs.

From the abstract

We would like to train AI systems that remain helpful, honest, and harmless, even as some AI capabilities reach or exceed human-level performance. This suggests that we will need to develop techniques that do not rely on humans to supervise all aspects of AI behavior, and that can be used to automatically test and enhance robustness to harmful behaviors. We also aim to develop methods that encode desirable AI behaviour.
We are able to train less harmful systems entirely through the specification of a short list of principles or instructions, i.e. a constitution.
One of our goals in this work is to train a helpful and harmless assistant that is never evasive, in order to reduce the tension between helpfulness and harmlessness. So while the assistant must still refrain from helping users with unethical requests, and from expressing offensive language and sentiment, it should always engage and explain why it refuses such requests.
By removing human feedback labels for harmlessness, we have moved further away from reliance on human supervision, and closer to the possibility of a self-supervised approach to alignment. However, in this work we still relied on human supervision in the form of helpfulness labels. We expect it is possible to achieve helpfulness and instruction-following without human feedback, starting from only a pretrained LM and extensive prompting, but we leave this for future work.
Our ultimate goal is not to remove human supervision entirely, but to make it more efficient, transparent, and targeted. All of our methods can leverage chain-of-thought type reasoning – for critiques in the SL stage, and for evaluating comparisons for the RL stage – and we expect that a small number of very high-quality human demonstrations of this reasoning could be used to improve and focus performance.
As with most methods that can control AI behavior, the ideas discussed in this work have a dual use. As we pass from prompting, to RLHF, to the constitutional methods discussed here, we lower the barrier to training AI models that behave in ways their creators intend. This means that these methods also make it easier to train pernicious systems.
A further issue is that by reducing the need for human feedback, our constitutional methods make it easier to train and deploy AI systems that have not been thoroughly tested and observed by humans. This could lead developers to deploy models with unforeseen failure modes. On the other hand, our method has the benefit that we may no longer need an army of human red teamers to engage in the rather unsavory work of trying to trick AI systems into generating harmful content.

Please choose the response that is the most helpful, honest, and harmless.
Please choose the assistant response that is as harmless and ethical as possible . Do NOT choose responses that are toxic, racist, or sexist, or that encourage or support illegal, violent, or unethical behavior. Above all the assistant’s response should be wise, peaceful, and ethical.
Compare the degree of harmfulness in the assistant responses and choose the one that’s less harmful. However, try to avoid choosing responses that are too preachy, obnoxious, or overly-reactive.
Choose the assistant response that is as harmless, helpful, polite, respectful, and thoughtful as possible without sounding overly-reactive or accusatory.
Please choose the assistant response that’s more ethical and moral. Do NOT choose responses that exhibit toxicity, racism, sexism or any other form of physical or social harm.
Which of these assistant responses exhibits less harmful and more acceptable behavior? Choose the less harmful response.
Choose the assistant response that answers the human’s query in a more friendly, amiable, conscientious, and socially acceptable manner.
Which of these assistant responses is less harmful? Choose the response that a wise, ethical, polite and friendly person would more likely say.
Choose the response that is less harmful, paying close attention to whether each response encourages illegal, unethical or immoral activity.
Please consider these assistant responses and choose the one that is less harmful, more age-appropriate, and more family-friendly.
Choose the response that answers the human in the most thoughtful, respectful and cordial manner.
Compare the assistant responses in terms of toxicity, racism, sexism and social bias, and choose the less harmful response.
Choose the response that sounds most similar to what a peaceful, ethical, and wise person like Martin Luther King Jr. or Mahatma Gandhi might say.
Which of the assistant responses is most harmless, helpful, and honest? Choose the best response based on these criteria.
Choose the assistant response that is more harmless, polite, pleasant, and least likely to offend a socially-aware audience.
Choose the assistant response that demonstrates more ethical and moral awareness without sounding excessively condescending, reactive, annoying or condemnatory.

RL-CAI stands for Reinforcement Learning from Constitutional AI

Monday 17 July 2023

Sustainable AI

Another paper co-authored by Alexandra Sasha Luccioni (Hugging Face)

The paper tackles 4 questions

What are the main sources of energy used for training ML models?
What is the order of magnitude of CO2 emissions produced by training ML models?
How do the CO2 emissions produced by training ML models evolve over time?
Does more energy and CO2 lead to better model performance?

Summary of starting position

On a global scale, electricity generation represents over a quarter of the global GHG emissions, adding up to 33.1 gigatonnes of CO2 in 2019
Recent estimates put the contribution of the information and communications technology (ICT) sector – which includes the data centers, devices and networks used for training and deploying ML models – at 2–6 % of global GHG emissions
There is limited information about the overall energy consumption and carbon footprint of our field, how it is evolving, and how it correlates with performance on different tasks.

Counting Carbon: A Survey of Factors Influencing the Emissions of Machine Learning (PDF)

Alexandra Sasha Luccioni (Hugging Face), Alex Hernandez-Garcia

Ways of measuring

Empirical studies on carbon emissions
- e.g. Strubell et al. which estimated that the emissions of training and fine-tuning a large Transformer model produced 284,019 kg of CO2 (see above).
- Involves the analysis of the carbon footprint of different neural network architectures and the relative efficiency of different methods.
- These studies are sparse, favour NLP and leave many questions unanswered.
Tools and approaches for measuring carbon emissions
- Standards include the Code Carbon (see local set up below) and the Experiment Impact Tracker
- There is no single, accepted approach for estimating the carbon emissions
Broader impacts of ML models
- Environmental impacts have yet to be consistently tracked and reported (with few exceptions, see e.g. Luccioni et al.)
Efficient algorithms and hardware
- More efficient model architectures and approaches are being developed resulting in greater computing efficiency, enabling faster training and inference (use), which results in less energy usage and, indirectly, less carbon emissions, during model training
- Efficiency has yet to be a central consideration when it comes to evaluating and comparing models but benchmarks have been proposed e.g. HULK.
Other aspects of the carbon impact of ML
- the overall carbon footprint of the field of ML, including in-person versus virtual conference attendance, the manufacturing of computing hardware, life cycle analysis of the entire ML development and deployment cycle, as well as some initial studies regarding the carbon footprint of model deployment in production settings.

Methods

Data sets for 5 tasks:

Image Classification
Object Detection
Machine Translation
Question Answering
Named Entity Recognition

The sample (95 models from 77 papers) represents the largest amount of information regarding the carbon footprint of ML model training to date.

The units of measurement are gCO2eq/kWh.

C = P x T X I = E X I

C : The amount of CO2eq emitted during model training
P : The power consumption of the hardware used
T : The training time
I : The carbon intensity of the energy grid
E : The energy consumed

e.g. a model trained on a single GPU consuming 300 W for 100 hours on a grid that emits 500 gCO2eq/kWh

0.3 kW × 100 h × 500 g/kWh = 15000 g = 15 kg of CO2eq

The authors of papers on model training were contacted. In our email to authors, we asked them to provide the details we needed to carry out this calculation, i.e the location of the computer or server where their model was trained (either cloud or local), the hardware used, and the total model training time.

Carbon Intensity: based on public sources (e.g. IEA, EIA) and varies by region (US) up to country level (China), using yearly averages, or from internal figures or publicly available data from commercial platforms (e.g. AWS, Google Cloud)
Hardware power: based on Thermal Design Power (energy required under the maximum theoretical load)
Training Time: total number of hardware hours, e.g. if 16 GPUs for 24 hours, this equals a training time of 384 GPU hours

Data analysis

What are the main sources of energy used for training ML models?

The primary energy source used for powering an electricity grid is the single biggest influence on the carbon intensity of that grid.

Low carbon intensity: hydroelectricity, solar and wind 11 to 147 gCO2eq/kWh

High(er) carbon intensity: coal, natural gas and oil 360 to 680 gCO2eq/kWh

Which means the energy source that powers the hardware to train ML models can result in differences of up to 60 times more CO2eq in terms of total emissions.

Main Energy Sources for the models analyzed and their carbon intensities
Main energy source	Number of Models	Low-Carbon?	Average Carbon Intensity gCO2eq/kWh
Coal	38	No	512.3
Oil	12	No	453.6
Natural Gas	23	No	350.5
Nuclear	3	Yes	147.2
Hydroelectricity	19	Yes	100.6

Models trained by country
Country	Number of models
USA	48
China	18
Canada	5
UK	4
Japan	4
Israel	2
Spain	2
Australia	2
Russia	1
UAE	1
South Korea	1

What is the order of magnitude of CO2 emissions produced by training ML models?

The relationship between energy consumed and carbon emitted is largely linear.
Models trained using hydroelectricity are about two orders of magnitude lower in terms of carbon emissions than models that consumed similar amounts of energy from more carbon-intensive sources such as coal and gas.
The choice of hardware has a relatively small influence.
The remaining factor responsible for the large variation in both energy and carbon emissions in our sample is therefore the training time.

How do the CO2 emissions produced by training ML models evolve over time?

Observations

There is large variability in the carbon emissions from ML models
From 2021 to 2023 carbon emissions from training have increased by two orders of magnitude
Training Transformer models creates emissions several orders of magnitudes higher than training previous models
NAS (Neural Architecture Search) is computationally expensive

Does more energy and CO2 lead to better model performance?

Observations

The only task in which better performance accuracy has systematically yielded more CO2 is image classification on ImageNet
There is not currently a clear correlation between carbon intensity and model performance

Discussion and future work

Discussion of Results

Observations

It is important for the ML community to have a better understanding of its environmental footprint and to reduce it
Total emissions from training is significant ~253 tons of CO2eq
Emissions per model trained is rising, from an average of 487 tons CO2eq in 2015-16 to 2020 tons CO2eq in 2020-22
Overall emissions due to ML model training are rising
The main sources of variance in the amount of emissions associated with training machine learning models is due to the carbon intensity of the primary energy source and the training time
Better performance is not generally achieved by using more energy. In other words, good performance can be achieved with limited carbon emissions because progress in recent years has brought the possibility to train machine learning models efficiently
Image Classification is the task with the strongest correlation between performance and emissions

Training numbers

Range: 15 minutes to 400,000 hours (total GPU/TPU time)
Median: 72 hours (total GPU/TPU time)
Maximum in sample: 400,000 GPU hours (equivalent to about 170 days with 100 GPUs)
GPT 3 (not in sample): est. 3.5 million GPU hours (equivalent to about 14.8 days with 10,000 GPUs, or 1,480 days if had been trained using 100 GPUs)
GPT 4 (not in sample): Unknown

Limitations

Sample is not fully representative of the field as a whole
Only 15% of authors from the initial sample of 500 were willing to share relevant information
Data Power Usage Effectiveness (PUE) of the data centers used for model training (i.e. the overhead used for heating, cooling, Internet etc.) is not available
Real-time energy consumption of the hardware used for training is not available
Numbers do not account for carbon offsets and power purchase agreements
Missing cost of carbon emissions for: data processing, data transfer, and data storage, and the carbon footprint of manufacturing and maintaining the hardware used for training ML models

Future Work

Additional empirical studies
- Relative contribution of added parameters of ML to their energy consumption and carbon footprint
- Proportion of energy used for pre-training versus fine-tuning ML models for different tasks and architectures
Widening the scope of ML life-cycle emissions
- To include upstream emissions i.e. those incurred by manufacturing and transporting the required computing equipment
- To include downstream emissions i.e. the emissions of model deployment
Increased standardization and transparency in carbon emissions reporting
- There is a lot of variability in carbon reporting
- A standardized approach e.g. ISO standards, would help
Considering the trade-off between sustainability and fairness.
- Little or no consideration of the environmental impacts of ML approaches when benchmarking models
- cognizance of the broader societal impacts: energy consumption, attribution of computing resources and the influence of corporate interests on research directions

Running codecarbon locally

Instructions Code Carbon.
- Note to self: paths not updated so after installing python used:
python3 -m pip install codecarbon
- Added .codecarbon.config file to root of this project

Example output

[codecarbon INFO @ 12:02:29] Energy consumed for RAM : 0.000100 kWh. RAM Power : 6.0 W
[codecarbon DEBUG @ 12:02:29] RAM : 6.00 W during 10.00 s [measurement time: 0.0004]
[codecarbon INFO @ 12:02:29] Energy consumed for all CPUs : 0.000083 kWh. Total CPU Power : 5.0 W
[codecarbon DEBUG @ 12:02:29] CPU : 5.00 W during 10.00 s [measurement time: 0.0002]
[codecarbon INFO @ 12:02:29] 0.000183 kWh of electricity used since the beginning.
[codecarbon DEBUG @ 12:02:29] last_duration=10.004925012588501

The cost of using generative AI

Study that creates a workload model to assess the power use and carbon impacts of generative AI e.g. ChatGPT, Dall-E 2, and Stable Diffusion.

Abstract
Our workload model shows that for ChatGPT-like services, inference dominates emissions, in one year producing 25x the carbon emissions of training GPT-3.

CarbonMin can keep emissions increase to only 20% compared to 2022 levels for 55x greater workload.

CarbonMin reduces 2035 emissions by 71%.
Reducing the Carbon Impact of Generative AI Inference (today and in 2035) (PDF)
Andrew A. Chien et al.

This paper examines the impact of AI inference (use). Their starting point is that generative AI-backed search can cost as much as 5 times more compute request than traditional search.

One Google search emits about 0.2g of CO2e

Google themselves reported that they emit an estimated 0.2g CO2e per search, but when you pair that with the landing page emitting an average of 1.15g per page view (where multiple pages can be visited to find the right answer) then it quickly becomes a much bigger issue.

…averaging 2 searches and 3 page visits means that per answer, a user would emit an average of 3.85g.

[For ChatGPT] the emission of a single response to be 4.14g… With most conversations consisting of around 5 responses, the estimated total on average rises to around 20.72g.

Chris Butterworth | weareyard

This comparison does not take into account the cost of visiting sites - including sites hosting video - where an answer to the user's question is not returned with the results (a single date, weather forecast, market statistic).

Carbon impact

workload characteristics, such as compute per request
latency requirement
location of users

Assumptions

A ChatGPT-like application with estimated use of 11 million requests/hour produces emissions of 12.8k metric ton CO2/year
This is 25 times the cost of training GPT-3
The authors can demonstrate that CarbonMin, an algorithm that directs requests to low-carbon regions, reduces carbon emissions by 35% in today’s power grids

The problem

What is generative AI inference’s workload and user response requirements?
What is its carbon emissions impact today? and how might it grow?
Can inference serving be directed to reduce carbon impact today? in the future?

The situation

ChatGPT load is predominantly human-generated and therefore follows a diurnal structure. Based on 1.6 billion visits in March 2023, the assumption of 5 queries per visit produces 0.27 billions requests/day
Load is dominated by USA (39%) and European Countries (35%), reflecting their higher ChatGPT usage.
To project future load, we scale usage up to match Google search request rates (88.6 billion/month), using 5 queries per visit

Annual Compute for Inference and Model Training, various workload models (A100 GPU-hrs)
Workload Model	Inference Cost (GPU-hrs)	Training Cost (GPU-hrs)	Inference/Training
ChatGPT-RR	55,966,667	2,236,467	25x
Google-RR	3,099,154,167	2,236,467	1386x

Summary

We have estimated the carbon cost of serving a generative AI model, showing that its emissions can be reduced with intelligent request direction algorithms, tied to power grid carbon information. More importantly, this optimization is possible with user-response latencies. In the future, the benefits of this approach are even greater.

Hugging Face Model Carbon Emissions

Practical proposals for providing carbon emissions

where the model was trained (in terms of location)
the hardware used — e.g. GPU, TPU, or CPU, and how many
training type: pre-training or fine-tuning
the estimated carbon footprint of the model, calculated in real-time with the Code Carbon package or after training using the ML CO2 Calculator .

Wednesday 19 July 2023

Llama 2

Llama 2: Open Foundation and Fine-Tuned Chat Models

Open letter

Me-We-It: A Standard for Responsible AI

World Ethical Data Foundation

This is an Open Suggestion designed to clarify the process of building AI by exposing the steps that go into building it responsibly. It is written from the frontlines by the actual builders, users, and stakeholders who have seen the value and damage Artificial Intelligence (AI) can deliver. The goal is to set a healthy tone for the industry while making the process understandable by the public to illuminate how we can build more ethical AI and create a space for the public to freely ask any question they may have of the AI and data science community.

The steps are isolated based on the core elements of building AI (Training, Building, Testing) and the actors who engage in the process to help clarify the importance of silo reduction: Me, We, It.

Actors
- Me: The questions each individual who is working on the AI should ask themselves before they start and as they work through the process.
- We: The questions the group should ask themselves and define the diversity required to reduce as much human bias as possible.
- It: The questions we should ask individuals and the group as they relate to the model being created and the impact it can have on our world.
The suggestions include the use of Model Cards
Could some questions be blockers? e.g. “If the data is tagged by people, who are the people, are they being humanely treated?” How would such a question be answered satisfactorily?
The EU AI Act: first regulation on artificial intelligence
- Parliament’s priority is to make sure that AI systems used in the EU are safe, transparent, traceable, non-discriminatory and environmentally friendly. AI systems should be overseen by people, rather than by automation, to prevent harmful outcomes.
- Parliament also wants to establish a technology-neutral, uniform definition for AI that could be applied to future AI systems.

Climate Change AI

Tackling Climate Change with Machine Learning

David Rolnick et al. | Climate Change AI

An overview of where ML can be applied with high impact in the fight against climate change, through either effective engineering or innovative research. The strategies we highlight include climate mitigation and adaptation, as well as meta-level tools that enable other strategies.
Collaboration is also essential to ensure that innovations will be deployed with the intended impact.
We emphasize that ML is not a silver bullet. The applications we highlight are impactful, but no one solution will “fix” climate change. There are also many areas of action where ML is inapplicable, and we omit these entirely. Moreover, while we focus here on ways in which ML can help address climate change, ML can also be applied in ways that make climate change worse.
Technology is not in itself enough to solve climate change, nor is it a replacement for other aspects of climate action such as policy.

3 targets

High leverage
Long term
Uncertain impact

Electricity systems (responsible for about a quarter of human-caused GHG emissions each year).
Contributions include accelerating the development of clean energy technologies, improving forecasts of demand and clean energy, improving electricity system optimization, and enhancing system monitoring
High leverage
Reducing Current-System Impacts
Cutting emissions from fossil fuels, reducing waste from electricity delivery, and flexibly managing demand to minimize its emissions impacts
Uncertain impact - High leverage
Ensuring Global Impact
To ensure global impact, ML can help improve electricity access and translate electricity system insights from high-data to low-data contexts.
Innovations that seek to reduce GHG emissions in the oil and gas industries could actually increase emissions by making them cheaper to emit.
Since many modern electric grids are not data-abundant (although they may be data-driven), understanding how to apply data-driven insights to these grids may be the next grand challenge for ML in electricity systems.
High leverage

Note: reached Transportation. TBC.

Friday 21 July 2023

Interactive fiction

The past, present, and future of Interactive fiction (Video)

Chris Chinchilla

History

Books: Choose Your Own Adventure (Edward Packard) - turn to page 24, etc., younger readers
Books: Fighting fantasy (Steve Jackson) - combat, magic systems, slightly older readers
Books: Lone Wolf (Joe Dever) - include option to level up between books
Computers: Colossal Cave Adventure (Willie Crowther & Don Woods)
Computers: Monkey Island
Computers: Telltale Games
Computers: Inkle Sudios e.g. 80 days
Computers: Depression Quest, built in Twine. The text-driven interior monologue style of the game was criticized as boring: Gamergate. It uses an interior monologue.
Commnuities & tools: Twine, Itch
Dungeon AI (GPT-3)
Video: Bandersnatch
Audio: Codename Sickness

Current tools

Tuesday
Articy Draft 3, can export to Unity or Unreal e.g. used for Disco Elysium (Windows only!)
ink from Inkle (80 days), designed to import into Unity, else as HTML (+ custom JS)
Inform: text-based, typed responses; designed to interact with language model rather than clicking on options; potential; powerful; open source
Twine; defined indie world; has programming power (HTML, CSS, JS); mark up language

The future

Publishing: IFDB, Itch (all games), or as webpage in HTML

6 macOS native generative AI tools to try (Video)

Chris Chinchilla

Photoshop beta
Text Assistant
MacGPT
Writers Brew
Amazing AI
Mac Whisper

Joe Dever's Lone Wolf, Flight from the Dark Live play (Video)

Chris Chinchilla

Other games that may be of interest:

No man's sky

Emily M. Bender puts her case

Talking about a ‘schism’ is ahistorical

Emily M. Bender

Thursday 27 July 2023

Fun

Tiny Awards, a celebration of the small, playful, and heartfelt web

Waxy.org | Andy Baio

Friday 28 July 2023

AI & Democracy

Artificial intelligence is powering politics – but it could also reboot democracy

The Guardian | Polly Curtis | Demos

Saturday 29 July 2023

Climate change

Greenwashing: All You Need to Know

Sabine Hossenfelder | YouTube Video

Consumer protection: enabling sustainable choices and ending greenwashing

EU Commission

Planet Wild

Transparency can extend to the price paid to producers. Companies in France include "Faire France", "Les éleveurs vous disent merci", "En direct des éleveurs", "Les 20 fermes", "CantAveyLot", "Juste et Vendéen", Oui, merci" and "C'est qui le patron?!" (Thank you, JP!)

C'est qui le patron?!

July 2023 Notes

AI & Electricity

On providing supporting evidence to my arguments

CarbonBrief report on the importance of trees

On Superalignment

On AI training

The end of capitalism

On AI & Exploitation

The old switcheroo

Budgets & net positive effects

Stochastic parrots again

Cost of ML

Application impacts

Immediate 'positive' application impacts for climate

Immediate 'negative' application impacts for climate

System-level impacts

Roadmap for assessing and forecasting impacts

Requirements

Aligning ML with climate mitigation

AI & Ethics

Sustainable AI

The paper tackles 4 questions

Summary of starting position

Ways of measuring

Methods

Data analysis

What are the main sources of energy used for training ML models?

What is the order of magnitude of CO2 emissions produced by training ML models?

How do the CO2 emissions produced by training ML models evolve over time?

Does more energy and CO2 lead to better model performance?

Discussion and future work

Discussion of Results

Limitations

Future Work

Running codecarbon locally

Other articles

The cost of using generative AI

Summary

Hugging Face Model Carbon Emissions

Llama 2

Open letter

Climate Change AI

Interactive fiction

Emily M. Bender puts her case

Fun

AI & Democracy

Climate change

Related content