Authentication API

Summary

We use session cookies, a local cache and some flow logic so that users can work with minimal interruptions.

rhaptos2.repo.auth

:module:`auth` supplies the logic to authenticate users and then store that authentication for later requests. It is strongly linked with :module:`sessioncache`.

Future revisions will pull out the authentication chunk and replace it with user-profile-auth; the authorisation logic will remain.

Overview

auth.handle_user_authentication is called on before_request and ensures we end up either with a verified user in a session or with a user who is unable to authenticate.

after_authentication is called by the OpenID (or similar) machinery to trigger the session cache management.

Known issues

requesting_user_uri: this is passed around a lot. This is suboptimal, and I think it should be replaced with passing around the environ dict as a means of linking functions with the request calling them.

I am still passing around the userd in g. This is fairly silly but seems consistent for Flask. Will need a rethink.

secure (https) - desired future toggle

further notes at http://executableopinions.mikadosoftware.com/en/latest/labs/webtest-cookie/cookie_testing.html

userdict example:

{"interests": null, "identifiers": [{"identifierstring": "https://michaelmulich.myopenid.com", "user_id": "cnxuser:75e06194-baee-4395-8e1a-566b656f6924", "identifiertype": "openid"}], "user_id": "cnxuser:75e06194-baee-4395-8e1a-566b656f6924", "suffix": null, "firstname": null, "title": null, "middlename": null, "lastname": null, "imageurl": null, "otherlangs": null, "affiliationinstitution_url": null, "email": null, "version": null, "location": null, "recommendations": null, "preferredlang": null, "fullname": "Michael Mulich", "homepage": null, "affiliationinstitution": null, "biography": null}
rhaptos2.repo.auth.add_location_header_to_response(fn)[source]

add Location: header

from: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html For 201 (Created) responses, the Location is that of the new resource which was created by the request

decorator that assumes we are getting a flask response object
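
The decorator idea above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: StubResponse stands in for a Flask response object, and the resource_url attribute (where the view records the URL of the new resource) is an assumption made purely for the example.

```python
import functools


class StubResponse(object):
    """Hypothetical stand-in for a Flask response object."""

    def __init__(self, status_code, resource_url=None):
        self.status_code = status_code
        self.headers = {}
        # Assumed attribute: where the view recorded the new resource's URL.
        self.resource_url = resource_url


def add_location_header(fn):
    """Decorator: add a Location header to 201 (Created) responses."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        resp = fn(*args, **kwargs)
        if resp.status_code == 201 and resp.resource_url:
            resp.headers["Location"] = resp.resource_url
        return resp
    return wrapper


@add_location_header
def create_thing():
    # Illustrative view: pretend a resource was created at this URL.
    return StubResponse(201, resource_url="/things/1")


@add_location_header
def read_thing():
    # A plain 200 response is passed through untouched.
    return StubResponse(200)
```

Only responses with status 201 are changed; everything else is returned as the view produced it.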

rhaptos2.repo.auth.after_authentication(authenticated_identifier, method)[source]

Called after a user has provided a validated ID (openid or persona).

This would be an endpoint in Velruse.

method: either openid or persona

Param :authenticated_identifier

pass on responsibility to

rhaptos2.repo.auth.apply_cors(resp)[source]
rhaptos2.repo.auth.asjson(pyobj)[source]

just placeholder

>>> x = {'a':1}
>>> asjson(x)
'{"a": 1}'
rhaptos2.repo.auth.authenticated_identifier_to_registered_user_details(ai)[source]

Given an authenticated_identifier (ai) request full user details from the user service

returns dict of userdetails (success),
None (user not registered) or error (user service down).
rhaptos2.repo.auth.callstatsd(dottedcounter)[source]
rhaptos2.repo.auth.create_session(userdata)[source]

A closure function that is stored and called at end of response, allowing us to set a cookie, with correct uuid, before response obj has been created (before request is processed !)

Param :userdata - a userdict format.
Returns:sessionid

cookie settings:

  • cnxsessionid - a fixed key string that is constant
  • expires - we want a cookie that will live even if the user shuts down the browser. However, it should not live forever ...?
  • httponly - prevent easy CSRF, however allow AJAX to request browser to send cookie.
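
As a rough sketch of the cookie settings listed above, the Set-Cookie value could be built with the Python 3 standard library as shown here. The helper name build_session_cookie, the seven-day lifetime and the path are assumptions for illustration; only the cnxsessionid key, the expiry and the httponly flag come from the notes above.

```python
import email.utils
import time
from http.cookies import SimpleCookie


def build_session_cookie(sessionid, lifetime_seconds=7 * 24 * 3600):
    """Return a Set-Cookie header value for the given session id."""
    cookie = SimpleCookie()
    cookie["cnxsessionid"] = sessionid  # fixed key, per-session value
    morsel = cookie["cnxsessionid"]
    # An explicit expiry lets the cookie survive a browser shutdown.
    # The seven-day default is an assumed value, not taken from the module.
    morsel["expires"] = email.utils.formatdate(
        time.time() + lifetime_seconds, usegmt=True)
    morsel["httponly"] = True  # hide the cookie from page scripts
    morsel["path"] = "/"       # assumed: cookie applies to the whole site
    return morsel.OutputString()


header_value = build_session_cookie("00000000-0000-0000-0000-000000000001")
```

The returned string is what would go after "Set-Cookie:" in the response headers.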

rhaptos2.repo.auth.create_temp_user(identifiertype, identifierstring)[source]

We should ping to user service and create a temporary userid linked to a made up identifier. This can then be linked to the unregistered user when they finally register.

FIXME - needs to actually talk to the user service. This is however an asynchronous problem; solve under session id.

rhaptos2.repo.auth.delete_session(sessionid)[source]

Request that the browser remove the cookie from the client, and remove the session from the session-cache database.

rhaptos2.repo.auth.handle_user_authentication(flask_request)[source]

Correctly perform all authentication workflows

We have 16 options for different user states such as IsLoggedIn, NotRegistered. The states are listed below.

This function is where eventually all 16 will be handled. For the moment only a limited number are.

Parameters:flask_request – request object of Pocoo flavour.
Returns:No return is good because it allows the onward processing of requests.

Otherwise we return a login page.

This gets called on before_request (which is after processing of HTTP headers but before __call__ on wsgi.)

Note

All the functions in sessioncache, and auth, should be called from here (possibly in a chain) and raise errors or other signals to allow this function to take action, not to presume on some action (like a redirect) themselves. (todo-later: such late decisions are well suited for deferred callbacks)

Auth  Reg  InSession  ProfileCookie  Next Action / RoleType            Handled Here
Y     Y    Y          Y              Go                                Y
Y     Y    Y          N              set_profile_cookie                Y
Y     Y    N          Y              set_session                       Y
Y     Y    N          N              FirstTimeOK
Y     N    Y          Y              ErrorA
Y     N    Y          N              ErrorB
Y     N    N          Y              ErrorC
Y     N    N          N              NeedToRegister
N     N    Y          Y              AnonymousGo
N     N    Y          N              set_profile_cookie
N     N    N          Y              LongTimeNoSee
N     N    N          N              FreshMeat
N     Y    Y          Y              Conflict with anonymous and reg?
N     Y    Y          N              Err-SetProfile-AskForLogin
N     Y    N          Y              NotArrivedYet
N     Y    N          N              CouldBeAnyone

All of the final four are problematic: if the user has not authenticated, how do we know they are registered? Trust the profile cookie?
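
The state table above can be sketched as a plain lookup keyed on the four flags. This is only an illustration of the dispatch idea; the names NEXT_ACTION and next_action are hypothetical, and the real function would act on the result rather than return a label.

```python
# Each row: flags in the order Auth, Reg, InSession, ProfileCookie,
# followed by the "Next Action / RoleType" label from the table.
_ROWS = [
    ("YYYY", "Go"),
    ("YYYN", "set_profile_cookie"),
    ("YYNY", "set_session"),
    ("YYNN", "FirstTimeOK"),
    ("YNYY", "ErrorA"),
    ("YNYN", "ErrorB"),
    ("YNNY", "ErrorC"),
    ("YNNN", "NeedToRegister"),
    ("NNYY", "AnonymousGo"),
    ("NNYN", "set_profile_cookie"),
    ("NNNY", "LongTimeNoSee"),
    ("NNNN", "FreshMeat"),
    ("NYYY", "Conflict with anonymous and reg?"),
    ("NYYN", "Err-SetProfile-AskForLogin"),
    ("NYNY", "NotArrivedYet"),
    ("NYNN", "CouldBeAnyone"),
]

# Map (auth, reg, in_session, profile_cookie) booleans to the action label.
NEXT_ACTION = {
    tuple(flag == "Y" for flag in flags): action for flags, action in _ROWS
}


def next_action(auth, reg, in_session, profile_cookie):
    """Look up the action label for one combination of user states."""
    key = (bool(auth), bool(reg), bool(in_session), bool(profile_cookie))
    return NEXT_ACTION[key]
```

Keeping the table as data makes it easy to check that all 16 combinations are covered before any handling code is written.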

We examine the request, find the session cookie, register any logged-in user, or redirect to login pages.

rhaptos2.repo.auth.lookup_session(sessid)[source]

As this will be called on every request and is a network lookup, we should strongly look into redis-style local disk caching. Performance monitoring of the request life cycle?

returns python dict of user_dict format.
or None if no session ID in cache or Error if lookup failed for other reason.
rhaptos2.repo.auth.redirect_to_login()[source]

On first hitting the site, the user will have no cookie. If we issued a 301, the browser would issue another request, which would have no cookie, which would issue a 301...

By presenting this HTML when the user hits the login server, we avoid this. Clearly templating is needed.

rhaptos2.repo.auth.session_to_user(flask_request_cookiedict, flask_request_environ)[source]

Given a request environment and cookie

>>> cookies = {"cnxsessionid": "00000000-0000-0000-0000-000000000000",}
>>> env = {}
>>> userd = session_to_user(cookies, env)
>>> userd["fullname"]
'pbrian'
Params flask_request_cookiedict:
 the cookiejar sent over as a dict(-like obj).
Params flask_request_environ:
 a dict like object representing WSGI environ
Returns:Err if lookup fails, userdict if not
rhaptos2.repo.auth.set_autosession()[source]

This is a convenience function for development. It should fail in production.

rhaptos2.repo.auth.set_temp_session()[source]

A temporary session is not yet fully implemented. A temporary session allows an unregistered and unauthorised user to visit the site, acquire a temporary userid and a normal session.

Then they will be able to work as normal, the workspace and acls set to the just invented temporary id.

However work saved will be irrecoverable after session expires...

rhaptos2.repo.auth.setup_auth()[source]

As part of the drive to remove app setup from the import process, I have moved the setup calls into this function. This is driving a circular import cycle which, while temporarily solved, will only be fixed by changing the logging process.

So to ensure docs work, and as a nod towards cleaning up the import-time work happening here, this needs to be called by run.

rhaptos2.repo.auth.store_userdata_in_request(userd, sessionid)[source]

given a userdict, keep it in the request cycle for later reference. Best practise here will depend on web framework.

rhaptos2.repo.auth.userspace()[source]
rhaptos2.repo.auth.whoami()[source]

based on session cookie returns userd dict of user details, equivalent to mediatype from service / session

rhaptos2.repo.sessioncache

sessioncache is a standalone module providing the ability to control persistent-session client cookies and profile-cookies.

sessioncache.py is a “low-level” piece, and is expected to be used in conjunction with lower-level authentication systems such as OpenID and with “higher-level” authorisation systems such as the flow-control in auth.py

persistent-session
This is the period of time during which a web server will accept an ID number presented as part of an HTTP request as a replacement for an actual valid form of authentication. (We remember that someone authenticated a while ago, and assume no-one is able to impersonate them in the intervening time period.)
persistent-session cookie
This is a cookie set on a client browser that stores an ID number pertaining to a persistent-session. It will last beyond a browser shutdown, and is expected to be sent as an HTTP header as part of each request to the server.

Why? Because I was getting confused with lack of fine control over sessions and because the Flask implementation relied heavily on encryption which seems to be the wrong direction. So we needed a server-side session cookie impl. with fairly fine control.

I intend to replace the existing SqlAlchemy based services with pure psycopg2 implementations, but for now I will be content not adding another feature to SA

Session Cache

The session cache needs to be a fast, distributed lookup system for matching a random ID to a dict of user details.

We shall store the user details in the table session_cache.

Discussion

Caches are hard. They need to be very very fast, and in this case distributable. Distributed caches are very very hard because we need to ensure they are synched.

I feel redis makes an excellent cache choice in many circumstances - it is blazingly fast for key-value lookups, it is simple, it is threadsafe (as in threads in the main app do not maintain any pooling or thread issues other than opening a socket or keeping it open) and it has decent synching options.

However the synching is a serious concern, and as such using a centralised, fast database will allow us to move to production with a secure solution, without the immediate reliance on cache-invalidation strategies.

Overview

We have one single table, session_cache. This stores a json string (as a string, not 9.3 JSON type) as value in a key value pair. The key is a UUID-formatted string, passed in from the application. It is expected we will never see a collision.
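
To make the single-table design concrete, here is a minimal sketch that uses an in-memory sqlite3 database as a stand-in for the real PostgreSQL backend. The column names, the expires_at column, the one-hour default lifetime and the function names cache_set, cache_get and cache_delete are all assumptions for illustration, not the module's schema or API.

```python
import json
import sqlite3
import time
import uuid

# sqlite3 stands in for PostgreSQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE session_cache ("
    " sessionid TEXT PRIMARY KEY,"   # UUID-formatted key from the application
    " userdict TEXT NOT NULL,"       # JSON stored as a plain string
    " expires_at REAL NOT NULL)"     # assumed expiry column (epoch seconds)
)


def cache_set(sessionid, userd, ttl_seconds=3600):
    """Store userd under sessionid; raises ValueError on a malformed id."""
    uuid.UUID(sessionid)
    conn.execute(
        "INSERT OR REPLACE INTO session_cache VALUES (?, ?, ?)",
        (sessionid, json.dumps(userd), time.time() + ttl_seconds))
    conn.commit()
    return True


def cache_get(sessionid):
    """Return the stored dict, or None if missing or expired."""
    row = conn.execute(
        "SELECT userdict FROM session_cache"
        " WHERE sessionid = ? AND expires_at > ?",
        (sessionid, time.time())).fetchone()
    return json.loads(row[0]) if row else None


def cache_delete(sessionid):
    """Remove the session, if present."""
    conn.execute("DELETE FROM session_cache WHERE sessionid = ?", (sessionid,))
    conn.commit()
```

Unlike the real get_session, which returns a database row, this sketch hands back the decoded dict directly.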

We have three commands: set_session, get_session and delete_session.

With this we can test the whole lifecycle as below.

Example Usage

First we pass in a badly formed id:

>>> sid = "Dr. Evil"
>>> get_session(sid)
Traceback (most recent call last):
...

Rhaptos2Error: Incorrect UUID format for sessionid...

OK, now let's use a properly formatted (but unlikely) UUID:

>>> sid = "00000000-0000-0000-0000-000000000001"
>>> set_session(sid, {"name":"Paul"})
True
>>> userd = get_session(sid)
>>> print userd[0]
00000000-0000-0000-0000-000000000001
>>> delete_session(userd[0])

To do

  • greenlets & conn pooling
  • wrap returned recordset in dict.
  • pg’s UUID type?

Standalone usage

minimalconfd = {"app": {'pghost':'127.0.0.1',
                        'pgusername':'repo',
                        'pgpassword':'CHANGEME',
                        'pgdbname':'dbtest'}
               }

import sessioncache
sessioncache.set_config(minimalconfd)
sessioncache.initdb()
sessioncache._fakesessionusers()
sessioncache.get_session("00000000-0000-0000-0000-000000000000")
{u'interests': None, u'user_id': u'cnxuser:75e06194-baee-4395-8e1a-566b656f6920', ...}
rhaptos2.repo.sessioncache.connection_refresh(conn)[source]

Connections should be pooled and returned here.

rhaptos2.repo.sessioncache.delete_session(sessionid)[source]

Remove an existing but no longer wanted session(id) from session_cache, for whatever reason we want to end a session.

Parameters:sessionid – Sessionid from cookie

Returns:nothing on success.

rhaptos2.repo.sessioncache.exec_stmt(insql, params)[source]

trivial ability to run a dm query outside SQLAlchemy.

Parameters:
  • insql – A correctly parameterised SQL stmt ready for psycopg driver.
  • params – iterable of parameters to be inserted into insql
Returns:a dbapi recordset (list of tuples)

rhaptos2.repo.sessioncache.get_session(sessionid)[source]
Given a sessionid, if it exists, and is “in date” then
return userdict (opposite of set_session)

Otherwise return None (We do not error out on id not found)

NB this depends heavily on co-ordinating the incoming TZ of the DB and the python app server - I am solely running the check on the database, which avoids that but does make it less portable.

rhaptos2.repo.sessioncache.getconn()[source]

returns a connection object based on global confd.

This is, at the moment, not a pooled connection getter.

We do not want the ThreadedPool here, as it is designed for “real” threads, and listens to their states, which will be ‘awkward’ in moving to greenlets.

We want a pool that will relinquish control back using gevent calls

https://bitbucket.org/denis/gevent/src/5f6169fc65c9/examples/psycopg2_pool.py http://initd.org/psycopg/docs/pool.html

Returns:psycopg2 connection obj, or psycopg2.Error (Err)
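
For the pooling that getconn does not yet do, one possible shape is a small queue-backed pool. This is a sketch only: SimplePool is a hypothetical name, sqlite3 stands in for psycopg2 so the example runs on its own, and the standard library queue used here would presumably be swapped for a gevent-aware one (for example gevent.queue.Queue) to get the cooperative yielding described above.

```python
import queue
import sqlite3
from contextlib import contextmanager


class SimplePool(object):
    """Hypothetical fixed-size pool that hands out idle connections."""

    def __init__(self, connect, size=2):
        # connect is any zero-argument callable returning a new connection.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks until a connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back


# sqlite3 is a stand-in; with psycopg2 the callable would open a PG connection.
pool = SimplePool(lambda: sqlite3.connect(":memory:"), size=1)
with pool.connection() as conn:
    result = conn.execute("SELECT 1").fetchone()
```

The context manager guarantees the connection goes back to the pool even if the query raises.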
rhaptos2.repo.sessioncache.initdb()[source]

A helper function for creating the This should be in backend, but it was easier to submit one module only.

rhaptos2.repo.sessioncache.maintenance_batch()[source]

A holding location for ways to clean up the session cache over time. These will need improvement and testing.

rhaptos2.repo.sessioncache.run_query(insql, params)[source]

trivial ability to run a query outside SQLAlchemy.

Parameters:
  • insql – A correctly parameterised SQL stmt ready for psycopg driver.
  • params – iterable of parameters to be inserted into insql
Returns:a dbapi recordset (list of tuples)

run_query("SELECT * FROM tbl WHERE id = %s;", (15,))

issues: lots.

  • No fetch_iterator.
  • connection per query(see above)
  • We should at least return a dict per row with fields as keys.
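
The last two issues above (no iterator, no dict per row) could be addressed along these lines. This is a sketch rather than the module's code: rows_as_dicts is a hypothetical helper, and sqlite3 stands in for psycopg2, which is why the placeholder here is ? rather than %s. It relies only on the DB-API cursor.description attribute.

```python
import sqlite3


def rows_as_dicts(cursor):
    """Lazily yield one dict per row, keyed by column name."""
    names = [col[0] for col in cursor.description]
    for row in cursor:
        yield dict(zip(names, row))


# Small stand-in table so the example runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, name TEXT)")
conn.execute("INSERT INTO tbl VALUES (?, ?)", (15, "Paul"))

cur = conn.execute("SELECT * FROM tbl WHERE id = ?", (15,))
rows = list(rows_as_dicts(cur))
```

Because the helper is a generator, callers can iterate over large result sets without building the whole list first.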
rhaptos2.repo.sessioncache.set_config(confd)[source]
rhaptos2.repo.sessioncache.set_session(sessionid, userd)[source]

Given a sessionid (generated according to cnxsessionid spec elsewhere) and a userdict store in session cache with appropriate timeouts.

Parameters:
  • sessionid – a UUID, that is to be the new sessionid
  • userd – python dict of format cnx-user-dict.
Returns:

True on successful setting.

Can raise Rhaptos2Errors

TIMESTAMPS. We are comparing the time now with the expirytime of the cookie in the database. This reduces the portability.

This beats the previous solution of passing in python formatted UTC and then comparing on database.

FIXME: bring comparison into python for portability across cache stores.
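
One way the FIXME above might be approached is to do the expiry comparison in Python using timezone-aware UTC datetimes, so the check no longer depends on the database clock or time zone. The helper names expiry_from_now and is_in_date are hypothetical, and this is a sketch of the idea, not the current behaviour.

```python
import datetime

UTC = datetime.timezone.utc


def expiry_from_now(ttl_seconds, now=None):
    """Return the aware UTC datetime at which a new session expires."""
    now = now or datetime.datetime.now(UTC)
    return now + datetime.timedelta(seconds=ttl_seconds)


def is_in_date(expires_at, now=None):
    """True while the stored expiry is still in the future."""
    now = now or datetime.datetime.now(UTC)
    return expires_at > now
```

Passing now explicitly keeps the comparison testable and independent of whichever cache store holds the expiry value.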

rhaptos2.repo.sessioncache.validate_uuid_format(uuidstr)[source]

Given a string, try to ensure it is of type UUID.

>>> validate_uuid_format("75e06194-baee-4395-8e1a-566b656f6920")
True
>>> validate_uuid_format("FooBar")
False
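
A check with the behaviour shown in the doctest above could be written with the standard uuid module, roughly as follows. The function name validate_uuid_format_sketch is hypothetical, and insisting on the canonical hyphenated form is an assumption; the real function may be more or less strict.

```python
import uuid


def validate_uuid_format_sketch(uuidstr):
    """Return True if uuidstr is a canonical hyphenated UUID string."""
    try:
        parsed = uuid.UUID(uuidstr)
    except (ValueError, AttributeError, TypeError):
        # Not a string, or not parseable as a UUID.
        return False
    # uuid.UUID also accepts braces and unhyphenated hex; reject those here.
    return str(parsed) == uuidstr.lower()
```

Comparing against str(parsed) is what rules out the looser spellings that uuid.UUID would otherwise accept.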

Future Developments

  • registration on user service
  • API Tokens and user service
  • reliance by other services on user service logged in (single Sign on)
