Dockerfiles changes and uv introduction #2139
base: develop
…reduce overall size; switch to using uv instead of poetry
…current almalinux image
This error occurs the first time the site worker is launched if celerybeat-schedule already exists. The site worker creates a new file called celerybeat-schedule.db and erases the old one, and everything seems to work fine afterward. I am not sure whether we want to investigate this further before merging this PR. Here is the error log:
2026-02-06 16:37:29.520 | ERROR | celery.beat:_destroy_open_corrupted_schedule:512 - Removing corrupted schedule file 'celerybeat-schedule': error('db type is dbm.gnu, but the module is not available')
Traceback (most recent call last):
File "/.venv/lib/python3.10/site-packages/kombu/utils/objects.py", line 42, in __get__
return obj.__dict__[self.__name__]
│ │ │ └ 'scheduler'
│ │ └ <kombu.utils.objects.cached_property object at 0x76d9b6f69600>
│ └ <attribute '__dict__' of 'Service' objects>
└ <celery.beat.Service object at 0x76d9b6f6aa10>
KeyError: 'scheduler'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/.venv/bin/celery", line 10, in <module>
sys.exit(main())
│ │ └ <function main at 0x76d9bd06c0d0>
│ └ <built-in function exit>
└ <module 'sys' (built-in)>
File "/.venv/lib/python3.10/site-packages/celery/__main__.py", line 16, in main
_main()
└ <function main at 0x76d9bc28ed40>
File "/.venv/lib/python3.10/site-packages/celery/bin/celery.py", line 322, in main
cmd.execute_from_commandline(argv)
│ │ └ None
│ └ <function CeleryCommand.execute_from_commandline at 0x76d9bc28f490>
└ <celery.bin.celery.CeleryCommand object at 0x76d9bd097e80>
File "/.venv/lib/python3.10/site-packages/celery/bin/celery.py", line 499, in execute_from_commandline
super(CeleryCommand, self).execute_from_commandline(argv)))
│ │ └ ['/.venv/bin/celery', '-A', 'celery_config', 'worker', '-B', '-Q', 'site-worker', '-l', 'info', '-n', 'site-worker@%n', '--co...
│ └ <celery.bin.celery.CeleryCommand object at 0x76d9bd097e80>
└ <class 'celery.bin.celery.CeleryCommand'>
File "/.venv/lib/python3.10/site-packages/celery/bin/base.py", line 305, in execute_from_commandline
return self.handle_argv(self.prog_name, argv[1:])
│ │ │ │ └ ['/.venv/bin/celery', 'worker', '-B', '-Q', 'site-worker', '-l', 'info', '-n', 'site-worker@%n', '--concurrency=2']
│ │ │ └ 'celery'
│ │ └ <celery.bin.celery.CeleryCommand object at 0x76d9bd097e80>
│ └ <function CeleryCommand.handle_argv at 0x76d9bc28f400>
└ <celery.bin.celery.CeleryCommand object at 0x76d9bd097e80>
File "/.venv/lib/python3.10/site-packages/celery/bin/celery.py", line 491, in handle_argv
return self.execute(command, argv)
│ │ │ └ ['worker', '-B', '-Q', 'site-worker', '-l', 'info', '-n', 'site-worker@%n', '--concurrency=2']
│ │ └ 'worker'
│ └ <function CeleryCommand.execute at 0x76d9bc28f1c0>
└ <celery.bin.celery.CeleryCommand object at 0x76d9bd097e80>
File "/.venv/lib/python3.10/site-packages/celery/bin/celery.py", line 415, in execute
return cls(
└ <class 'celery.bin.worker.worker'>
File "/.venv/lib/python3.10/site-packages/celery/bin/worker.py", line 223, in run_from_argv
return self(*args, **options)
│ │ └ {'app': None, 'broker': None, 'result_backend': None, 'loader': None, 'config': None, 'workdir': None, 'no_color': None, 'qui...
│ └ []
└ <celery.bin.worker.worker object at 0x76d9bd036a10>
File "/.venv/lib/python3.10/site-packages/celery/bin/base.py", line 253, in __call__
ret = self.run(*args, **kwargs)
│ │ │ └ {'app': None, 'broker': None, 'result_backend': None, 'loader': None, 'config': None, 'workdir': None, 'no_color': None, 'qui...
│ │ └ ()
│ └ <function worker.run at 0x76d9bc28eb90>
└ <celery.bin.worker.worker object at 0x76d9bd036a10>
File "/.venv/lib/python3.10/site-packages/celery/bin/worker.py", line 259, in run
worker.start()
│ └ <function WorkController.start at 0x76d9bbd26950>
└ <Worker: site-worker@34e82492e0fe (running)>
File "/.venv/lib/python3.10/site-packages/celery/worker/worker.py", line 208, in start
self.blueprint.start(self)
│ │ │ └ <Worker: site-worker@34e82492e0fe (running)>
│ │ └ <function Blueprint.start at 0x76d9bbd3f0a0>
│ └ <celery.worker.worker.WorkController.Blueprint object at 0x76d9b6a489a0>
└ <Worker: site-worker@34e82492e0fe (running)>
File "/.venv/lib/python3.10/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
│ │ └ <Worker: site-worker@34e82492e0fe (running)>
│ └ <function StartStopStep.start at 0x76d9bbd3fc70>
└ <step: Beat>
File "/.venv/lib/python3.10/site-packages/celery/bootsteps.py", line 369, in start
return self.obj.start()
│ │ └ <function BaseProcess.start at 0x76d9bc954ca0>
│ └ <_Process(Beat, started)>
└ <step: Beat>
File "/.venv/lib/python3.10/site-packages/billiard/process.py", line 124, in start
self._popen = self._Popen(self)
│ │ │ │ └ <_Process(Beat, started)>
│ │ │ └ <staticmethod(<function Process._Popen at 0x76d9bc956e60>)>
│ │ └ <_Process(Beat, started)>
│ └ None
└ <_Process(Beat, started)>
File "/.venv/lib/python3.10/site-packages/billiard/context.py", line 276, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
│ │ └ <_Process(Beat, started)>
│ └ <function DefaultContext.get_context at 0x76d9bc956f80>
└ <billiard.context.DefaultContext object at 0x76d9bc9a23e0>
File "/.venv/lib/python3.10/site-packages/billiard/context.py", line 333, in _Popen
return Popen(process_obj)
│ └ <_Process(Beat, started)>
└ <class 'billiard.popen_fork.Popen'>
File "/.venv/lib/python3.10/site-packages/billiard/popen_fork.py", line 24, in __init__
self._launch(process_obj)
│ │ └ <_Process(Beat, started)>
│ └ <function Popen._launch at 0x76d9b66b4af0>
└ <billiard.popen_fork.Popen object at 0x76d9b66c0eb0>
File "/.venv/lib/python3.10/site-packages/billiard/popen_fork.py", line 79, in _launch
code = process_obj._bootstrap()
│ └ <function BaseProcess._bootstrap at 0x76d9bc955900>
└ <_Process(Beat, started)>
File "/.venv/lib/python3.10/site-packages/billiard/process.py", line 327, in _bootstrap
self.run()
│ └ <function _Process.run at 0x76d9b6625000>
└ <_Process(Beat, started)>
File "/.venv/lib/python3.10/site-packages/celery/beat.py", line 707, in run
self.service.start(embedded_process=True)
│ │ └ <function Service.start at 0x76d9b6624af0>
│ └ <celery.beat.Service object at 0x76d9b6f6aa10>
└ <_Process(Beat, started)>
File "/.venv/lib/python3.10/site-packages/celery/beat.py", line 622, in start
humanize_seconds(self.scheduler.max_interval))
│ │ └ <kombu.utils.objects.cached_property object at 0x76d9b6f69600>
│ └ <celery.beat.Service object at 0x76d9b6f6aa10>
└ <function humanize_seconds at 0x76d9bc3bf2e0>
File "/.venv/lib/python3.10/site-packages/kombu/utils/objects.py", line 44, in __get__
value = obj.__dict__[self.__name__] = self.__get(obj)
│ │ │ │ │ └ <celery.beat.Service object at 0x76d9b6f6aa10>
│ │ │ │ └ <kombu.utils.objects.cached_property object at 0x76d9b6f69600>
│ │ │ └ 'scheduler'
│ │ └ <kombu.utils.objects.cached_property object at 0x76d9b6f69600>
│ └ <attribute '__dict__' of 'Service' objects>
└ <celery.beat.Service object at 0x76d9b6f6aa10>
File "/.venv/lib/python3.10/site-packages/celery/beat.py", line 666, in scheduler
return self.get_scheduler()
│ └ <function Service.get_scheduler at 0x76d9b6624ca0>
└ <celery.beat.Service object at 0x76d9b6f6aa10>
File "/.venv/lib/python3.10/site-packages/celery/beat.py", line 657, in get_scheduler
return symbol_by_name(self.scheduler_cls, aliases=aliases)(
│ │ │ └ {}
│ │ └ 'celery.beat:PersistentScheduler'
│ └ <celery.beat.Service object at 0x76d9b6f6aa10>
└ <function symbol_by_name at 0x76d9bc9be170>
File "/.venv/lib/python3.10/site-packages/celery/beat.py", line 501, in __init__
Scheduler.__init__(self, *args, **kwargs)
│ │ │ │ └ {'app': <Celery __main__ at 0x76d9bc921240>, 'schedule_filename': 'celerybeat-schedule', 'max_interval': 0, 'lazy': False}
│ │ │ └ ()
│ │ └ <celery.beat.PersistentScheduler object at 0x76d9bc9a22f0>
│ └ <function Scheduler.__init__ at 0x76d9b6783400>
└ <class 'celery.beat.Scheduler'>
File "/.venv/lib/python3.10/site-packages/celery/beat.py", line 257, in __init__
self.setup_schedule()
│ └ <function PersistentScheduler.setup_schedule at 0x76d9b66245e0>
└ <celery.beat.PersistentScheduler object at 0x76d9bc9a22f0>
> File "/.venv/lib/python3.10/site-packages/celery/beat.py", line 519, in setup_schedule
self._store = self._open_schedule()
│ │ │ └ <function PersistentScheduler._open_schedule at 0x76d9b66244c0>
│ │ └ <celery.beat.PersistentScheduler object at 0x76d9bc9a22f0>
│ └ None
└ <celery.beat.PersistentScheduler object at 0x76d9bc9a22f0>
File "/.venv/lib/python3.10/site-packages/celery/beat.py", line 509, in _open_schedule
return self.persistence.open(self.schedule_filename, writeback=True)
│ │ │ │ └ 'celerybeat-schedule'
│ │ │ └ <celery.beat.PersistentScheduler object at 0x76d9bc9a22f0>
│ │ └ <function open at 0x76d9bbd248b0>
│ └ <module 'shelve' from '/root/.local/share/uv/python/cpython-3.10.19-linux-x86_64-gnu/lib/python3.10/shelve.py'>
└ <celery.beat.PersistentScheduler object at 0x76d9bc9a22f0>
File "/root/.local/share/uv/python/cpython-3.10.19-linux-x86_64-gnu/lib/python3.10/shelve.py", line 243, in open
return DbfilenameShelf(filename, flag, protocol, writeback)
│ │ │ │ └ True
│ │ │ └ None
│ │ └ 'c'
│ └ 'celerybeat-schedule'
└ <class 'shelve.DbfilenameShelf'>
File "/root/.local/share/uv/python/cpython-3.10.19-linux-x86_64-gnu/lib/python3.10/shelve.py", line 227, in __init__
Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
│ │ │ │ │ │ │ │ └ True
│ │ │ │ │ │ │ └ None
│ │ │ │ │ │ └ 'c'
│ │ │ │ │ └ 'celerybeat-schedule'
│ │ │ │ └ <function open at 0x76d9b66b44c0>
│ │ │ └ <module 'dbm' from '/root/.local/share/uv/python/cpython-3.10.19-linux-x86_64-gnu/lib/python3.10/dbm/__init__.py'>
│ │ └ <shelve.DbfilenameShelf object at 0x76d9bc923670>
│ └ <function Shelf.__init__ at 0x76d9bbd24a60>
└ <class 'shelve.Shelf'>
File "/root/.local/share/uv/python/cpython-3.10.19-linux-x86_64-gnu/lib/python3.10/dbm/__init__.py", line 91, in open
raise error[0]("db type is {0}, but the module is not "
└ (<class 'dbm.error'>, <class 'OSError'>)
dbm.error: db type is dbm.gnu, but the module is not available
2026-02-06 16:37:29.677 | ERROR | celery.worker.consumer.consumer:_error_handler:428 - consumer: Cannot connect to amqp://rabbit-username:**@rabbit:5672//: [Errno 111] Connection refused.
Trying again in 2.00 seconds... (1/100)
2026-02-06 16:37:31.690 | ERROR | celery.worker.consumer.consumer:_error_handler:428 - consumer: Cannot connect to amqp://rabbit-username:**@rabbit:5672//: [Errno 111] Connection refused.
Trying again in 4.00 seconds... (2/100)
2026-02-06 16:37:35.697 | INFO | celery.worker.consumer.connection:start:24 - Connected to amqp://rabbit-username:**@rabbit:5672//
2026-02-06 16:37:35.702 | INFO | celery.worker.consumer.mingle:sync:43 - mingle: searching for neighbors
2026-02-06 16:37:36.722 | INFO | celery.worker.consumer.mingle:sync:46 - mingle: sync with 1 nodes
2026-02-06 16:37:36.723 | INFO | celery.worker.consumer.mingle:sync:50 - mingle: sync complete
2026-02-06 16:37:36.735 | INFO | celery.apps.worker:on_consumer_ready:161 - site-worker@34e82492e0fe ready.
2026-02-06 16:37:39.238 | INFO | celery.worker.control:enable_events:281 - Events of group {task} enabled by remote.
2026-02-06 16:40:35.369 | INFO | celery.worker.strategy:task_message_handler:157 - Received task: competitions.tasks.unpack_competition[9c4f8861-380f-4f96-b797-9f960085425b]
2026-02-06 16:40:35.371 | INFO | competitions.tasks:unpack_competition:338 - Starting unpack with status pk = 19892
2026-02-06 16:40:35.400 | INFO | competitions.tasks:unpack_competition:357 - Download competition bundle: dataset/2026-02-06-1770396035/2c5c72900b8e/iris3.zip
[...]
As you can see, I tried uploading a competition bundle and everything worked without any error. This error only occurs once, as the old file gets deleted and a new one is created on disk.
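If we do want to look into this before merging, a minimal sketch like the one below could confirm the cause. The traceback suggests the existing schedule file was written with the dbm.gnu backend, which the uv-managed interpreter cannot import. The helper names (`backend_available`, `diagnose_schedule`) are hypothetical and not part of this PR; only the standard-library `dbm.whichdb` and `importlib.import_module` calls are relied on.

```python
import dbm
import importlib


def backend_available(module_name: str) -> bool:
    """Return True if the named dbm backend (e.g. 'dbm.gnu') can be imported."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


def diagnose_schedule(path: str) -> str:
    """Describe whether the current interpreter can open the dbm file at `path`.

    `path` is the name passed to shelve/dbm (hypothetical default below:
    'celerybeat-schedule'); dbm.whichdb also checks suffixed variants such
    as '.db', '.dat' and '.dir'.
    """
    kind = dbm.whichdb(path)
    if kind is None:
        # whichdb returns None when the file is missing or unreadable.
        return f"{path}: no schedule file found"
    if kind == "":
        # An empty string means the file exists but its format is not recognized.
        return f"{path}: unrecognized database format"
    if backend_available(kind):
        return f"{path}: written with {kind}, backend available"
    return f"{path}: written with {kind}, but that backend is not available"


if __name__ == "__main__":
    # Assumption: run from the directory where celery beat keeps its schedule.
    print(diagnose_schedule("celerybeat-schedule"))
```

Running this inside both the old and the new image should show whether the two interpreters disagree on the available dbm backends. If they do, the one-time "corrupted schedule" message is expected when an old file is carried over, and deleting the old file before the first start would avoid it.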
A brief description of the purpose of the changes contained in this PR.
- packaging/container/Containerfile*
- almalinux:10-minimal, making images much smaller (1.4 GB to 400 MB for site worker and django) and faster to build. This also reduces the number of packages in the image, reducing CVEs
- Dockerfile.builder (now Containerfile.builder) base image to latest stable usable version
- docker-compose.yml

Overall, this PR reduces CVEs, reduces image size on disk, makes the images build much faster (less than 1 minute compared to the current 2+ minutes), and makes the CircleCI tests finish in around 7 min 50 s compared to the current 9 min 50 s runtime.
It also introduces uv as the package manager for the compute worker and the project itself, which will make future package upgrades easier and also simplify managing the Python version needed to install the packages.
Closes #2064
Closes #2063
Checklist