This rule raises an issue when the identity operator is used with cached literals.
The identity operators is and is not check if the same object is on both sides, i.e. a is b returns
True if id(a) == id(b).
The CPython interpreter caches certain built-in values for integers, bytes, floats, strings, frozensets and tuples. When a value is cached, all its references are pointing to the same object in memory; their ids are identical.
The following example illustrates this caching mechanism:
my_int = 1
other_int = 1
id(my_int) == id(other_int)  # True
In both assignments (to my_int and other_int), the assigned value 1 comes from the interpreter cache, so only
one integer object 1 is created in memory. This means both variables reference the same object. For this reason, their ids are
identical and my_int is other_int evaluates to True. This mechanism lets the interpreter improve performance and save
memory by not creating a new object every time a commonly used value is needed.
However, this caching mechanism does not apply to every value:
my_int = 1000
id(my_int) == id(1000)  # False in an interactive CPython session
my_int is 1000  # False in an interactive CPython session
In this example, the integer 1000 is not cached. A reference to 1000 can create a new integer object in memory with a new
id. This means that my_int is 1000 can evaluate to False, as the two objects may have different ids.
This is why using the identity operators on integers, bytes, floats, strings, frozensets and tuples is unreliable: the behavior changes depending on the value.
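The value-dependent behavior can be observed with a small sketch. The helper name describe is hypothetical, and the integers are built at runtime with int() so that the compiler cannot share a single constant between the two operands. The identity results are a CPython implementation detail and may differ on other interpreters; only the equality results are guaranteed.

```python
import platform


def describe(a, b):
    """Return a pair (equal, identical) for two objects (hypothetical helper)."""
    return (a == b, a is b)


# Build the operands at runtime so each call may produce a distinct object.
small = describe(int("1"), int("1"))
large = describe(int("1000"), int("1000"))

# The first element of each pair is the value comparison, which is reliable.
# The second element depends on the interpreter's caching strategy.
print(platform.python_implementation(), "small:", small, "large:", large)
```

Whatever the interpreter prints for the identity results, the equality results stay the same, which is why == is the comparison to rely on.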
Moreover, the caching behavior is not part of the Python language specification and could vary between interpreters. CPython 3.8 and later emit a SyntaxWarning when a literal is compared using an identity operator.
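The interpreter warning can be captured programmatically. The following is a minimal sketch, assuming CPython 3.8 or later; the source snippet and the "<example>" file name are made up for illustration. The exact wording of the message varies between versions, so the sketch only collects the messages without relying on their text.

```python
import sys
import warnings

# Hypothetical snippet that compares a literal with an identity operator.
source = "x = 1000\nx is 1000\n"

with warnings.catch_warnings(record=True) as caught:
    # Record every warning, even one that was already shown earlier.
    warnings.simplefilter("always")
    compile(source, "<example>", "exec")

# Keep only the syntax warnings raised while compiling the snippet.
messages = [
    str(w.message) for w in caught if issubclass(w.category, SyntaxWarning)
]
print(sys.version_info[:2], messages)
```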
This rule raises an issue when at least one operand of an identity operator is of type
int, bytes, float, frozenset or tuple, or is a string literal. If you need to compare these types, you should use the equality operators == or != instead.
The only case where the is operator could be used with a cached type is with "interned" strings. The Python interpreter provides a way
to explicitly cache any string literal and benefit from improved performance.
This explicit caching is done through interned strings (i.e. sys.intern("some string")), such as:
from sys import intern
my_text = "text"
intern("text") is intern(my_text) # True
Note however that interned strings don’t necessarily have the same identity as string literals.
It is also important to note that interned strings may be garbage collected, so in order to benefit from their caching mechanism, a reference to the interned string should be kept.
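The difference between ordinary and interned strings can be sketched as follows. The variable names are illustrative only. The two strings are concatenated at runtime so that they are separate objects before interning, and references to the interned results are kept so that they stay alive for the comparison.

```python
import sys

prefix = "interned "

# Two equal strings built at runtime; they are not guaranteed to be one object.
first = prefix + "string"
second = prefix + "string"

# sys.intern returns the single interned copy for a given string value.
first_interned = sys.intern(first)
second_interned = sys.intern(second)

print(first == second)                    # value comparison
print(first_interned is second_interned)  # identity of the interned copies
```

Keeping first_interned and second_interned bound to names is what prevents the interned string from being garbage collected between the two calls.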
Use the equality operators (== or !=) to compare int, bytes, float,
frozenset, tuple and string literals.
my_int = 2000
my_int is 2000  # Noncompliant: the integer 2000 may not be cached, the identity operator could return False.
() is tuple()  # Noncompliant: this will return True only because the CPython interpreter cached the empty tuple.
(1,) is tuple([1])  # Noncompliant: comparing non empty tuples will return False as none of these objects are cached.
my_int = 2000
my_int == 2000
() == tuple()
(1,) == tuple([1])