Conversation
suhaibmujahid
left a comment
There was a problem hiding this comment.
Thank you for taking care of this, and sorry for the late review! Please take a look at my comments.
| utils.upload_s3([diff_zst_path]) | ||
| else: | ||
| diff_errors += 1 | ||
| for future in tqdm(as_completed(futures), totla=len(futures)): |
There was a problem hiding this comment.
Shouldn't we use as_completed() within the ThreadPoolExecutor() context?
| return None | ||
|
|
||
|
|
||
| def process_diff(bug_id, obj, upload, repo_path): |
There was a problem hiding this comment.
I would add a docstring to explain the returned value since it is not straightforward and could be confusing. Also, I would add a type hint for it as well.
| return None | ||
|
|
||
|
|
||
| def process_diff(bug_id, obj, upload, repo_path): |
There was a problem hiding this comment.
Using a descriptive function name would be helpful.
| with open(diff_path, "wb") as f: | ||
| f.write(diff) | ||
|
|
||
| utils.zstd_compress(diff_path) | ||
|
|
||
| os.remove(diff_path) | ||
|
|
||
| if upload: | ||
| utils.upload_s3([diff_zst_path]) |
There was a problem hiding this comment.
It will return None in this path, but the caller expects tuple(int, int).
| with open(diff_path, "wb") as f: | ||
| f.write(diff) | ||
|
|
||
| utils.zstd_compress(diff_path) | ||
|
|
||
| os.remove(diff_path) | ||
|
|
||
| if upload: | ||
| utils.upload_s3([diff_zst_path]) |
There was a problem hiding this comment.
I would keep the happy path to the lift where possible, that is, I would flip the condition and return early for the falsy value, i.e., (0, 1). This way, we won't need to use else. That was not possible with the previous implementation, but now that the logic is wrapped in a function, it is better to do so.
Description
I added threading to generate diff to make it run concurrently,. The design is same retrieve_logs. The only difference is that we returen a tuple (mapping_erros, diff_errors) and add them in the end for the total.
Linked PRS
Closes #5400