Graphcast Laptop
Google has launched GraphCast, a new approach to weather forecasting. Instead of relying on massive HPC infrastructure, they use AI to predict the weather. They describe their method here and have also published a paper in Science. To get you started, they offer a setup on Colaboratory.
Since they promise that a forecast uses far fewer resources and should even run on a laptop, I wanted to give it a try.
I cloned the repository onto my Ubuntu 22 installation with Python 3.10.
It seemed to work right from the start, but I soon ran into package version conflicts with my Python version.
After some tweaking, all version conflicts were resolved. Here is my final pip freeze:
(venv) difu@diggler ~/Programming/graphcast/graphcast main ± pip freeze
absl-py==2.0.0
anyio==4.2.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens==2.4.1
async-lru==2.0.4
attrs==23.1.0
Babel==2.14.0
beautifulsoup4==4.12.2
bleach==6.1.0
cachetools==5.3.2
Cartopy==0.22.0
certifi==2023.11.17
cffi==1.16.0
chardet==3.0.4
charset-normalizer==3.3.2
chex==0.1.85
click==8.1.7
cloudpickle==3.0.0
colabtools==0.0.1
comm==0.2.0
contourpy==1.2.0
cycler==0.12.1
dask==2023.12.1
debugpy==1.8.0
decorator==5.1.1
defusedxml==0.7.1
dm-haiku==0.0.11
dm-tree==0.1.8
entrypoints==0.4
etils==1.6.0
exceptiongroup==1.2.0
executing==2.0.1
fastjsonschema==2.19.0
flax==0.7.5
fonttools==4.46.0
fqdn==1.5.1
fsspec==2023.12.2
google-api-core==1.34.0
google-auth==2.25.2
google-cloud-core==2.4.1
google-cloud-storage==1.44.0
google-crc32c==1.5.0
google-resumable-media==2.7.0
googleapis-common-protos==1.62.0
graphcast==0.1
idna==2.8
importlib-metadata==7.0.0
importlib-resources==6.1.1
ipykernel==6.27.1
ipython==8.18.1
ipython-genutils==0.2.0
ipywidgets==8.1.1
isoduration==20.11.0
jax==0.4.23
jaxlib==0.4.23
jedi==0.19.1
Jinja2==3.1.2
jmp==0.0.4
jraph==0.0.6.dev0
json5==0.9.14
jsonpointer==2.4
jsonschema==4.20.0
jsonschema-specifications==2023.11.2
jupyter-events==0.9.0
jupyter-lsp==2.2.1
jupyter_client==8.6.0
jupyter_core==5.5.0
jupyter_server==2.12.1
jupyter_server_terminals==0.5.0
jupyterlab==4.0.9
jupyterlab-widgets==3.0.9
jupyterlab_pygments==0.3.0
jupyterlab_server==2.25.2
kiwisolver==1.4.5
locket==1.0.0
markdown-it-py==3.0.0
MarkupSafe==2.1.3
matplotlib==3.8.2
matplotlib-inline==0.1.6
mdurl==0.1.2
mistune==3.0.2
ml-dtypes==0.3.1
msgpack==1.0.7
nbclient==0.9.0
nbconvert==7.12.0
nbformat==5.9.2
nest-asyncio==1.5.8
notebook_shim==0.2.3
numpy==1.26.2
opt-einsum==3.3.0
optax==0.1.7
orbax-checkpoint==0.4.8
overrides==7.4.0
packaging==23.2
pandas==2.1.4
pandocfilters==1.5.0
parso==0.8.3
partd==1.4.1
pexpect==4.9.0
pickleshare==0.7.5
Pillow==10.1.0
platformdirs==4.1.0
portpicker==1.2.0
prometheus-client==0.19.0
prompt-toolkit==3.0.43
protobuf==3.20.3
psutil==5.9.6
ptyprocess==0.7.0
pure-eval==0.2.2
pyasn1==0.5.1
pyasn1-modules==0.3.0
pycparser==2.21
Pygments==2.17.2
pyparsing==3.1.1
pyproj==3.6.1
pyshp==2.3.1
python-dateutil==2.8.2
python-json-logger==2.0.7
pytz==2023.3.post1
PyYAML==6.0.1
pyzmq==25.1.2
referencing==0.32.0
requests==2.31.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.7.0
rpds-py==0.14.1
rsa==4.9
Rtree==1.1.0
scipy==1.11.4
Send2Trash==1.8.2
shapely==2.0.2
simplegeneric==0.8.1
six==1.12.0
sniffio==1.3.0
soupsieve==2.5
stack-data==0.6.3
svgwrite==1.4.3
tabulate==0.9.0
tensorstore==0.1.51
terminado==0.13.3
tinycss2==1.2.1
tomli==2.0.1
toolz==0.12.0
tornado==6.4
traitlets==5.14.0
Tree==0.2.4
trimesh==4.0.5
types-python-dateutil==2.8.19.14
typing_extensions==4.9.0
tzdata==2023.3
uri-template==1.3.0
urllib3==1.24.3
wcwidth==0.2.12
webcolors==1.13
webencodings==0.5.1
websocket-client==1.7.0
widgetsnbextension==4.0.9
xarray==2023.12.0
zipp==3.17.0
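If you want to reproduce this environment, it helps to see which installed packages differ from the pins above. The following is a minimal sketch using only the standard library. The function names `parse_freeze` and `find_mismatches` are my own choices for illustration and are not part of the GraphCast repository.

```python
from importlib import metadata


def parse_freeze(text):
    """Parse `name==version` lines from pip freeze output into a dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip comments and anything that is not a pin (e.g. a shell prompt).
        if line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        # Normalize names so that jupyter_client and jupyter-client compare equal.
        pins[name.strip().lower().replace("_", "-")] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed_or_None)} for packages that differ."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

Saving the list above to a text file and passing its contents through `find_mismatches(parse_freeze(...))` yields an empty dict when every installed version matches its pin.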
Another problem is that the required data lives in a Google Cloud Storage bucket. Even though the bucket is public, you need some kind of Google credentials to access the data. On Colaboratory the vanilla code works fine, but I got errors when running it in my virtual machine. After some digging, I decided to use a Google Cloud service account: I created a user with minimal privileges and generated a key for it. Store the key in a safe place.
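One way to keep the key path out of the notebook itself is to read it from an environment variable and refuse to use a key file that other users on the machine can read. The sketch below assumes a POSIX system. The variable name `GRAPHCAST_GCS_KEY` and the function `resolve_key_file` are hypothetical, chosen only for this example.

```python
import os
import stat
from pathlib import Path


def resolve_key_file(env_var="GRAPHCAST_GCS_KEY"):
    """Return the key file path from env_var after basic safety checks.

    GRAPHCAST_GCS_KEY is a made-up variable name for this sketch.
    """
    raw = os.environ.get(env_var)
    if not raw:
        raise KeyError(f"environment variable {env_var} is not set")
    path = Path(raw).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"no key file at {path}")
    # Reject files that group or others can read, write, or execute.
    mode = stat.S_IMODE(path.stat().st_mode)
    if mode & 0o077:
        raise PermissionError(
            f"{path} has mode {oct(mode)}; consider chmod 600"
        )
    return path
```

The returned path can then be passed to `service_account.Credentials.from_service_account_file(...)` in place of a hard-coded string.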
Change the cell "Authenticate with Google Cloud Storage"
from
# @title Authenticate with Google Cloud Storage
# TODO: Figure out how to access a public cloud bucket without authentication.
from google.colab import auth
auth.authenticate_user()
gcs_client = storage.Client()
gcs_bucket = gcs_client.get_bucket("dm_graphcast")
to
# @title Authenticate with Google Cloud Storage
# TODO: Figure out how to access a public cloud bucket without authentication.
# from google.colab import auth
# auth.authenticate_user()
from google.oauth2 import service_account
credentials = service_account.Credentials.from_service_account_file(
    '/path/to/key_file.json')
gcs_client = storage.Client(credentials=credentials)
gcs_bucket = gcs_client.get_bucket("dm_graphcast")
Note: As you can see from the comment, the authors noticed this problem as well.
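Regarding that TODO: the `google-cloud-storage` library offers `storage.Client.create_anonymous_client()`, which builds a client without any credentials. Below is a minimal sketch of how the cell could use it. I have not verified that the `dm_graphcast` bucket allows anonymous reads for everything the notebook needs, so treat that as an assumption. `open_public_bucket` and its `client_factory` parameter are hypothetical names introduced here.

```python
def open_public_bucket(name, client_factory=None):
    """Return a bucket handle created without user credentials.

    client_factory is a hypothetical hook for this sketch; by default the
    anonymous client from google-cloud-storage is used.
    """
    if client_factory is None:
        # Imported lazily so the helper can be defined without the library.
        from google.cloud import storage
        client_factory = storage.Client.create_anonymous_client
    client = client_factory()
    # bucket() only builds a local handle and sends no request;
    # get_bucket() would query the bucket's metadata first.
    return client.bucket(name)
```

In the notebook this would replace the service account lines with `gcs_bucket = open_public_bucket("dm_graphcast")`. If anonymous access is rejected, the service account approach above still applies.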
After that, I was able to walk through the notebook in JupyterLab and do some forecasting!
Have a look!
And it really runs on my laptop…
WHOOHAYYY!