Is it possible to declare a variable in a module's namespace before that module is imported?
For example, to get this code to run as expected:
# a.py
# do magic here to make b.foo = "bar"
import b
b.printlocal()
# b.py
# imagine this file cannot be changed (e.g., for dumb political reasons)
local_var = foo
def printlocal():
    print(local_var)
Running a.py should print "bar". How can this be accomplished without changing b.py?
What I've Tried So Far
A. Patching
from unittest.mock import patch
with patch("b.foo"):
import b
b.printlocal()
Result: NameError: name 'foo' is not defined
Thoughts: I think this fails because patch has to import b in order to find the attribute to replace, so the NameError is raised by the import itself; patching can only swap an attribute on a module that has already imported successfully.
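Even passing an explicit replacement and create=True (so patch doesn't require the attribute to pre-exist) should, as far as I can tell, hit the same wall, which I read as the import of b blowing up inside patch rather than the patching step failing:
# a.py -- variant of A; I expect the same NameError, since patch must import b
# to reach b.foo, and executing b.py is what raises
from unittest.mock import patch

with patch("b.foo", "bar", create=True):
    import b
    b.printlocal()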
B. Namespace manipulation
# a.py
import sys
module_name = "b"
module_globals = sys.modules[module_name].__dict__
module_globals["foo"] = "bar"
import b
b.printlocal()
Result: KeyError: 'b'
Thoughts: This fails because b isn't in sys.modules until after it has been imported, so there is nothing to modify beforehand. I haven't tried it directly, but I suspect that manually creating the module with the names pre-seeded wouldn't work either, since the module wouldn't be imported a second time.
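To make that last idea concrete, this is roughly what I mean by creating the module by hand first; a minimal sketch using only stdlib importlib machinery, which I haven't verified against my real setup:
# a.py -- sketch: build the module object, seed foo, then execute b.py into it
import importlib.util
import sys

spec = importlib.util.find_spec("b")            # locate b.py without running it
module = importlib.util.module_from_spec(spec)  # empty module object for b
module.foo = "bar"                              # seed the missing name first
sys.modules["b"] = module                       # a later "import b" reuses this object
spec.loader.exec_module(module)                 # run b.py with foo already defined

import b
b.printlocal()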
C. Using locals import
# a.py
import importlib
b = importlib.__import__(
    "b",
    locals={"__builtins__": __builtins__, "foo": "bar"},
    # I also tried using globals here with the same error
)
b.printlocal()
Result: NameError: name 'foo' is not defined
Thought: I think this fails because the globals and locals arguments only affect how the import statement itself is interpreted (the standard implementation uses globals solely to determine the package context for relative imports and ignores locals entirely); they are never injected into the imported module's namespace. Documentation on this function is sparse.
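Another direction I've considered but not tried: skip the import machinery entirely and exec the source into a namespace that already contains foo. A rough sketch against the toy example above (assumes b.py sits in the working directory):
# a.py -- sketch: compile b.py and execute it into a pre-seeded module namespace
import types

with open("b.py") as f:
    source = f.read()

b = types.ModuleType("b")
b.foo = "bar"                                     # visible as a global while b.py runs
exec(compile(source, "b.py", "exec"), b.__dict__)
b.printlocal()                                    # printlocal's globals are b.__dict__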
Motivation: Why attempt this "unpythonic" monstrosity?
Databricks notebooks have spark and dbutils variables available at the top level, so there are hundreds of Python files (just in my repos) that cannot be imported, which makes unit testing difficult. I would love to be able to initialize these names as MagicMocks while importing, both to make testing possible and to avoid side effects.
In the example above, foo is a stand-in for spark and dbutils. b.py is a notebook that contains some Spark code with side effects as well as functions that should be tested; printlocal() is a stand-in for a function to be tested. Here's a quick example:
# Databricks notebook source
from pyspark.sql.functions import to_timestamp, concat, col
curr_date = dbutils.widgets.get("CurrentDate")  # fails if imported directly: name 'dbutils' is not defined
env = dbutils.widgets.get("Environment")
def dateStringToTimestampExpression(date_str):  # want to unittest this in a.py
    return to_timestamp(concat(col(date_str), "yyyy-MM-dd"))
df = spark.read.table(f"{env}.table_a")
df = df.withColumn("updated_on", dateStringToTimestampExpression(curr_date))
df.write.saveAsTable("table_a_enhanced") # undesirable side-effect
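For concreteness, the test bootstrap I'm hoping to end up with looks roughly like this. notebook_b is a made-up module name, it assumes one of the pre-seeding ideas above actually works, and it glosses over the fact that the pyspark column functions may still want a live SparkSession:
# test_notebook_b.py -- hypothetical test setup, not verified on Databricks
import importlib.util
import sys
from unittest.mock import MagicMock

spec = importlib.util.find_spec("notebook_b")   # the notebook's .py source
notebook = importlib.util.module_from_spec(spec)
notebook.dbutils = MagicMock()                  # dbutils.widgets.get(...) returns a MagicMock
notebook.spark = MagicMock()                    # reads/writes hit the mock, not a real table
sys.modules["notebook_b"] = notebook
spec.loader.exec_module(notebook)               # run the notebook top-level code once

# the function under test is then reachable as an ordinary attribute:
dateStringToTimestampExpression = notebook.dateStringToTimestampExpression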