Are global variables thread-safe in Flask? How do I share data between requests?

In my application, the state of a common object is changed by making requests, and the response depends on the state.

class SomeObj():
    def __init__(self, param):
        self.param = param
    def query(self):
        self.param += 1
        return self.param

global_obj = SomeObj(0)

@app.route("/")
def home():
    flash(global_obj.query())
    return render_template('index.html')

If I run this on my development server, I expect to get 1, 2, 3 and so on. If requests are made from 100 different clients simultaneously, can something go wrong? The expected result would be that the 100 different clients each see a unique number from 1 to 100. Or will something like this happen:

  1. Client 1 queries. self.param is incremented by 1.
  2. Before the return statement can be executed, the thread switches over to client 2. self.param is incremented again.
  3. The thread switches back to client 1, and the client is returned the number 2, say.
  4. Now the thread moves to client 2 and returns him/her the number 3.

Since there were only two clients, the expected results were 1 and 2, not 2 and 3. A number was skipped.

Will this actually happen as I scale up my application? What alternatives to a global variable should I look at?

    You can’t use global variables to hold this sort of data. Not only is it not thread safe, it’s not process safe, and WSGI servers in production spawn multiple processes. Not only would your counts be wrong if you were using threads to handle requests, they would also vary depending on which process handled the request.

    Use a data source outside of Flask to hold global data. A database, Memcached, or Redis are all appropriate separate storage areas, depending on your needs. If you need to load and access Python data, consider multiprocessing.Manager. You could also use the session for simple data that is per-user.
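As a minimal sketch of the "data source outside of Flask" idea, the counter can live in a database row that is incremented inside a transaction. SQLite from the standard library is used here only because it needs no server; the table layout and the helper names init_db, next_value and simulate_clients are hypothetical, and threads stand in for concurrent requests.

```python
import os
import sqlite3
import tempfile
import threading


def init_db(path):
    """Create the counter table with a single row starting at 0."""
    conn = sqlite3.connect(path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS counter "
            "(id INTEGER PRIMARY KEY, value INTEGER NOT NULL)"
        )
        conn.execute("INSERT OR IGNORE INTO counter (id, value) VALUES (1, 0)")
        conn.commit()
    finally:
        conn.close()


def next_value(path):
    """Increment the stored counter in one transaction and return the new value."""
    # isolation_level=None lets us issue BEGIN/COMMIT ourselves.
    conn = sqlite3.connect(path, timeout=30, isolation_level=None)
    try:
        # BEGIN IMMEDIATE takes the write lock up front, so the UPDATE and
        # the SELECT below cannot interleave with another writer.
        conn.execute("BEGIN IMMEDIATE")
        conn.execute("UPDATE counter SET value = value + 1 WHERE id = 1")
        (value,) = conn.execute("SELECT value FROM counter WHERE id = 1").fetchone()
        conn.execute("COMMIT")
        return value
    finally:
        conn.close()


def simulate_clients(path, clients=4, requests_per_client=5):
    """Call next_value() from several threads, as concurrent requests would."""
    results = []
    results_lock = threading.Lock()  # protects only the local results list

    def client():
        for _ in range(requests_per_client):
            value = next_value(path)
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=client) for _ in range(clients)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(results)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        db_path = os.path.join(folder, "counter.db")
        init_db(db_path)
        print(simulate_clients(db_path))
```

Because the state lives in the database file rather than in a Python object, the same function would also behave consistently when called from several worker processes, which an in-memory global cannot do.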


    The development server may run in a single thread and process, in which case you won’t see the behavior you describe because each request is handled synchronously. Enable threads or processes and you will see it: app.run(threaded=True) or app.run(processes=10). (Since Flask 1.0 the development server is threaded by default.)
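The interleaving from the question can be reproduced without a server. The sketch below is an illustration, not Flask code: the Counter class and run_clients helper are hypothetical, and a threading.Barrier is used to force every thread to read before any thread writes, which is the unlucky schedule a threaded server could hit by chance.

```python
import threading


class Counter:
    """Stand-in for the question's global object."""

    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()

    def unsafe_query(self, pause):
        current = self.value        # read
        pause()                     # a thread switch happens here
        self.value = current + 1    # write, based on a stale read
        return current + 1

    def locked_query(self):
        with self.lock:             # read-modify-write as one unit
            self.value += 1
            return self.value


def run_clients(n, query):
    """Run query() in n threads and return each thread's result."""
    results = [None] * n

    def client(slot):
        results[slot] = query()

    threads = [threading.Thread(target=client, args=(i,)) for i in range(n)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    n = 5
    racy = Counter()
    barrier = threading.Barrier(n)  # all threads read before any writes
    print(run_clients(n, lambda: racy.unsafe_query(barrier.wait)))
    safe = Counter()
    print(sorted(run_clients(n, safe.locked_query)))
```

With the forced schedule every client receives the same number and the counter advances only once. The lock repairs this within a single process, but it does nothing when a production server runs several processes, each with its own copy of the object.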


    Some WSGI servers may support gevent or another async worker. Global variables are still not thread safe because there’s still no protection against most race conditions. You can still have a scenario where one worker gets a value, yields, another modifies it, yields, then the first worker also modifies it.
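The lost update described above can be sketched with cooperative tasks. asyncio is used here as a stand-in for gevent (an assumption made so the example needs only the standard library); handle_request and serve_two_requests are hypothetical names, and the explicit await marks the point where a worker yields.

```python
import asyncio


async def handle_request(state):
    current = state["count"]        # the worker gets the value
    await asyncio.sleep(0)          # the worker yields to the event loop
    state["count"] = current + 1    # the worker writes back a stale result
    return current + 1


async def serve_two_requests():
    state = {"count": 0}
    seen = await asyncio.gather(handle_request(state), handle_request(state))
    return list(seen), state["count"]


if __name__ == "__main__":
    seen, total = asyncio.run(serve_two_requests())
    print(seen, total)
```

Both tasks read 0 before either writes, so both report 1 and the shared count ends at 1 rather than 2, even though only one task runs at any instant.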


    If you need to store some global data during a request, you may use Flask’s g object. Another common case is some top-level object that manages database connections. The distinction for this type of “global” is that it’s unique to each request, not used between requests, and there’s something managing the set up and teardown of the resource.
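A minimal sketch of per-request state on g follows, assuming Flask is installed. The events attribute and the record helper are hypothetical; before_request and teardown_appcontext provide the set up and teardown mentioned above.

```python
from flask import Flask, g, jsonify

app = Flask(__name__)


@app.before_request
def set_up():
    # A fresh list is attached to g at the start of every request.
    g.events = []


@app.teardown_appcontext
def tear_down(exc):
    # Discard the per-request data when the application context ends.
    g.pop("events", None)


def record(event):
    """Helper that any code running during the request can call."""
    g.events.append(event)


@app.route("/")
def index():
    record("handled /")
    return jsonify(events=g.events)


if __name__ == "__main__":
    client = app.test_client()
    print(client.get("/").get_json())
    print(client.get("/").get_json())
```

Each request sees a list containing only its own entry, which shows that nothing stored on g carries over to the next request.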

      This is not really an answer about the thread safety of globals.

      But I think it is important to mention sessions here.
      You are looking for a way to store client-specific data. Every connection should have access to its own pool of data, in a thread-safe way.

      This is possible with server-side sessions, and they are available in a very neat Flask extension: https://pythonhosted.org/Flask-Session/

      If you set up sessions, a session variable is available in all your routes and it behaves like a dictionary. The data stored in this dictionary is individual for each connecting client.

      Here is a short demo:

      from flask import Flask, session
      from flask_session import Session
      
      app = Flask(__name__)
      # Check Configuration section for more details
      SESSION_TYPE = 'filesystem'
      app.config.from_object(__name__)
      Session(app)
      
      @app.route("/")
      def reset():
          # Start this client's counter over
          session["counter"] = 0
      
          return "counter was reset"
      
      @app.route('/inc')
      def routeA():
          # Initialize the counter on this client's first visit
          if "counter" not in session:
              session["counter"] = 0
      
          session["counter"] += 1
      
          return "counter is {}".format(session["counter"])
      
      @app.route('/dec')
      def routeB():
          if "counter" not in session:
              session["counter"] = 0
      
          session["counter"] -= 1
      
          return "counter is {}".format(session["counter"])
      
      
      if __name__ == '__main__':
          app.run()
      

      After pip install Flask-Session, you should be able to run this. Try accessing it from different browsers and you’ll see that the counter is not shared between them.

      • What if I want to share more complex data like data frames or NumPy arrays? Is the session dictionary still usable? I remember getting a JSON serialization error. Can this be addressed, too? Thanks for this answer btw.

        – Blind0ne

        Jul 18 at 13:17


      Another example of a data source external to requests is a cache, such as what’s provided by Flask-Caching or another extension.

      1. Create a file common.py and place in it the following:
      from flask_caching import Cache
      
      # Instantiate the cache
      cache = Cache()
      
      2. In the file where your Flask app is created, register your cache with the following code:
      from pathlib import Path
      from flask import Flask
      
      # Import cache
      from common import cache
      
      # ...
      app = Flask(__name__)
      
      cache.init_app(app=app, config={"CACHE_TYPE": "filesystem", "CACHE_DIR": Path("/tmp")})
      
      3. Now use it throughout your application by importing the cache and calling it as follows:
      # Import cache
      from common import cache
      
      # store a value
      cache.set("my_value", 1_000_000)
      
      # Get a value
      my_value = cache.get("my_value")
      
