Random Topics: 001
I've been engaging in several disparate activities as of late, none of which are mature enough to be developed into their own individual article. Instead, like one great big spring mix salad, I'm going to amalgamate each of them into this one article covering the tiny pursuits that have captured my interests recently.
Python Hackery
ipdb is a drop-in replacement for the standard pdb debugger that adds syntax highlighting and other convenient IPython features to brighten up your debugging experience. I wanted the breakpoint builtin to call ipdb instead of pdb. We can start by finding the directory where Python stores external modules.
import site
print(site.getsitepackages()) # output: ['/usr/lib/python3.12/site-packages']
What's interesting about this directory is that the Python interpreter imports the sitecustomize.py file in that directory on startup. So, we can populate this file with some code that overrides the breakpoint builtin with a custom function that launches ipdb instead.
def f():
    import ipdb
    import sys
    # Start ipdb in the frame that called breakpoint(), not in this wrapper
    ipdb.__main__._init_pdb().set_trace(sys._getframe().f_back)

import builtins
builtins.__dict__['breakpoint'] = f
The code might look daunting, especially with the weird way I'm launching ipdb. I don't remember exactly why, but the conventional way of calling ipdb had some peculiar detail that I disliked, so I sifted through the source code and found ipdb/ipdb/__main__.py. It contains the internals for launching ipdb, and by skipping some of the steps we can tell ipdb to ignore the sitecustomize.py file and jump straight to the listing of the actual program code.
def _init_pdb(context=None, commands=[]):
    if context is None:
        context = os.getenv("IPDB_CONTEXT_SIZE", get_context_from_config())
    debugger_cls = _get_debugger_cls()
    try:
        p = debugger_cls(context=context)
    except TypeError:
        p = debugger_cls()
    p.rcLines.extend(commands)
    return p

def wrap_sys_excepthook():
    # make sure we wrap it only once or we would end up with a cycle
    # BdbQuit_excepthook.excepthook_ori == BdbQuit_excepthook
    if sys.excepthook != BdbQuit_excepthook:
        BdbQuit_excepthook.excepthook_ori = sys.excepthook
        sys.excepthook = BdbQuit_excepthook

def set_trace(frame=None, context=None, cond=True):
    if not cond:
        return
    wrap_sys_excepthook()
    if frame is None:
        frame = sys._getframe().f_back
    p = _init_pdb(context).set_trace(frame)
    if p and hasattr(p, "shell"):
        p.shell.restore_sys_module_state()
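For comparison, since Python 3.7 the breakpoint builtin simply calls sys.breakpointhook, so the same effect can probably be had without touching builtins at all. This is a minimal sketch, not what I actually run: make_hook and ipdb_debugger are names invented for illustration, and it assumes ipdb is installed and still exposes the private _init_pdb helper quoted above.

```python
import sys

def make_hook(get_debugger):
    """Build a breakpoint hook that starts a debugger in the caller's frame."""
    def hook(*args, **kwargs):
        # breakpoint() is implemented in C, so f_back of this hook is the
        # Python frame that actually called breakpoint()
        caller = sys._getframe().f_back
        get_debugger().set_trace(caller)
    return hook

def ipdb_debugger():
    # Assumption: ipdb is installed and keeps the private _init_pdb helper
    import ipdb
    return ipdb.__main__._init_pdb()

# In sitecustomize.py one would then write:
# sys.breakpointhook = make_hook(ipdb_debugger)
```

Setting PYTHONBREAKPOINT=ipdb.set_trace in the environment is the documented no-code route, although that goes through the regular ipdb.set_trace entry point rather than _init_pdb.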
Anyway, I also have export PYTHONSTARTUP=~/.pyrc.py set to import commonly used packages like numpy in the Python interpreter.
Gaze Estimation
I noticed that when I move my mouse cursor to a UI element, my eye gaze naturally moves to that target location first. So it would be very convenient and fluid to have my mouse cursor snap to my eye gaze location. I looked for existing solutions, but most required special hardware, while I wanted to accomplish this with just my consumer-grade webcam.
I'm not going to go into detail here, because I'm still working on this project, which will hopefully be robust enough in the near future for me to write an article on it. I trained a CNN on the MPIIFaceGaze dataset and it worked, just not well: accuracy is very low, and head movements throw it off. I'm reading more academic papers on this subject and investigating a master's student's GitHub repo on the subject.
Exposed Machine Learning Services
On the subject of machine learning, 404 Media had an article showing how many internal AI training tools are exposed out on the Internet for everyone to see. Indeed this is the case; I used hunter.how (poor man's Shodan) to search for a couple of these instances. Not very exciting, nothing important in them.
Mathematics
John D. Cook has an amazing Twitter account, Algebra Etc., that usually posts tantalizingly strange mathematical facts. He maintains other accounts on topology and logic, but those are less in tune with my preferences. Through the algebra account I met an interesting inequality, which had me looking into Ravi substitution. The idea is that $a,b,c$ are the sides of a non-degenerate triangle iff $a+b>c$, $a+c>b$, and $b+c>a$. Hence, if the problem specifies that the variables of an inequality form the sides of a triangle, we can substitute them like so: $a = x+y$, $b=y+z$, $c=z+x$ with $x,y,z>0$. The triangle inequalities then hold automatically (for instance $a+b-c=2y>0$), so positivity of $x,y,z$ is the only constraint left.
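To make the substitution concrete, here is a small sketch that inverts it: solving $a = x+y$, $b=y+z$, $c=z+x$ gives $x=(a+c-b)/2$, $y=(a+b-c)/2$, $z=(b+c-a)/2$. The function names ravi and is_triangle are made up for illustration.

```python
from fractions import Fraction

def ravi(a, b, c):
    """Return (x, y, z) such that a = x + y, b = y + z, c = z + x."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    x = (a + c - b) / 2
    y = (a + b - c) / 2
    z = (b + c - a) / 2
    return x, y, z

def is_triangle(a, b, c):
    """Sides form a non-degenerate triangle iff x, y, z are all positive."""
    return all(v > 0 for v in ravi(a, b, c))

x, y, z = ravi(3, 4, 5)
print(x, y, z)                 # the 3-4-5 triangle
print(is_triangle(1, 2, 3))    # degenerate case: 1 + 2 = 3
```

Each of $x$, $y$, $z$ is half of one triangle inequality's slack, which is why positivity of the new variables carries the whole triangle condition.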
I was also casually browsing graph theory articles on Wikipedia. Learned about perfect matchings and edge spaces. It's quite interesting, how you can think of an incidence matrix as a linear transformation from the edge space to the vertex space and the adjacency matrix as a linear operator on the vertex space. Read a presentation on how many different graph operations can be thought of as matrix multiplications.
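Here is a tiny sketch of that matrix view on a made-up four-vertex graph, in plain Python lists so it runs without numpy. Strictly speaking the edge space lives over GF(2); I use ordinary integers here because the counts are easier to read. All function names are my own.

```python
def incidence_matrix(n, edges):
    """Unoriented incidence matrix B: rows are vertices, columns are edges."""
    B = [[0] * len(edges) for _ in range(n)]
    for j, (u, v) in enumerate(edges):
        B[u][j] = 1
        B[v][j] = 1
    return B

def adjacency_matrix(n, edges):
    """Symmetric 0/1 adjacency matrix A of a simple undirected graph."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
    return A

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
B = incidence_matrix(4, edges)
A = adjacency_matrix(4, edges)

# B maps a vector in the edge space (a choice of edges) to the vertex space:
# each entry counts how many chosen edges touch that vertex.
matching = [[1], [0], [0], [1]]      # edges (0,1) and (2,3)
covered = matmul(B, matching)        # all ones means a perfect matching

# A acts on the vertex space; A squared counts walks of length two.
walks2 = matmul(A, A)

# B times its transpose equals the degree matrix plus A for a simple graph.
BBt = matmul(B, transpose(B))
```

The diagonal of walks2 gives the vertex degrees, since the only closed walks of length two go out along an edge and straight back.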
I've been reading Benjamin C. Pierce's book Basic Category Theory for Computer Scientists. It's quite mind numbing. It shows how an individual monoid can be thought of as a category with a single object, where the arrows are the elements and composition acts as the binary operation. In fact, add the requirement that each arrow is an isomorphism, and you get a group. But the collection of all monoids and the homomorphisms between them is also a category in and of itself. Also, apparently the concepts of injectivity and surjectivity permit neat (with definitions quite symmetrical) generalizations as mono- and epimorphisms.
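The single-object picture can be checked by brute force on a finite example. In this sketch (names are mine, not the book's) the arrows are the elements, op plays the role of composition, and identity is the identity arrow.

```python
from itertools import product

def is_monoid(elements, op, identity):
    """Check the category laws for a one-object category with these arrows."""
    closed = all(op(a, b) in elements for a, b in product(elements, repeat=2))
    associative = all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(elements, repeat=3)
    )
    unital = all(op(identity, a) == a == op(a, identity) for a in elements)
    return closed and associative and unital

def is_group(elements, op, identity):
    """Additionally require every arrow to be an isomorphism (invertible)."""
    return is_monoid(elements, op, identity) and all(
        any(op(a, b) == identity == op(b, a) for b in elements)
        for a in elements
    )

Z6 = set(range(6))
mul = lambda a, b: (a * b) % 6
add = lambda a, b: (a + b) % 6
```

Here (Z6, mul, 1) is a monoid but not a group, because 0 has no inverse, while (Z6, add, 0) passes both checks.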
Mixed Boolean-Arithmetic
I was reading about concolic analysis, then stumbled onto some material on code obfuscation. The idea of MBA (Mixed Boolean-Arithmetic) is that we can build complicated-looking identities by expressing operations as linear combinations of bitwise operations. For example, instead of writing x+y we can write (x^y)+2*(x&y), which is equivalent.
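The identity is easy to check exhaustively on 8-bit values. The sketch below also includes two other rewrites of the same flavor, which are my own additions rather than something from the reading: x+y as (x|y)+(x&y), and x-y as (x^y)-2*(~x&y).

```python
MASK = 0xFF  # stick to 8-bit values so an exhaustive check stays cheap

def add_mba(x, y):
    # XOR adds the bits without carries; AND marks the carries, which count double
    return ((x ^ y) + 2 * (x & y)) & MASK

def add_mba_or(x, y):
    # OR counts shared bits once, AND supplies the second copy
    return ((x | y) + (x & y)) & MASK

def sub_mba(x, y):
    # x ^ y equals x + y - 2*(x & y); removing 2*(~x & y) as well leaves x - y
    return ((x ^ y) - 2 * (~x & y)) & MASK

def holds_everywhere():
    """Compare each rewrite against plain arithmetic on every 8-bit pair."""
    return all(
        add_mba(x, y) == (x + y) & MASK
        and add_mba_or(x, y) == (x + y) & MASK
        and sub_mba(x, y) == (x - y) & MASK
        for x in range(256)
        for y in range(256)
    )
```

Obfuscators chain and nest rewrites like these, so the final expression looks nothing like the original addition.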
I read a couple of interesting papers relating to this. Rivest of RSA fame wrote a paper in 2001 about permutation polynomials modulo $2^w$, characterizing exactly when such a polynomial maps bijectively. I might write an article on this in the future, when I investigate this subject in more depth. It's quite interesting.
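As I understand the result, for the integers modulo $2^w$ with $w \ge 2$, a polynomial $a_0 + a_1 x + \dots + a_d x^d$ is a permutation exactly when $a_1$ is odd, $a_2 + a_4 + \dots$ is even, and $a_3 + a_5 + \dots$ is even. The sketch below compares that criterion against brute force on small cases; the function names are my own, and this is a sanity check rather than a proof.

```python
from itertools import product

def evaluate(coeffs, x, modulus):
    """Horner evaluation of a0 + a1*x + ... + ad*x^d modulo `modulus`."""
    acc = 0
    for a in reversed(coeffs):
        acc = (acc * x + a) % modulus
    return acc

def is_permutation(coeffs, w):
    """Brute force: does the polynomial hit every residue mod 2^w exactly once?"""
    n = 1 << w
    return len({evaluate(coeffs, x, n) for x in range(n)}) == n

def rivest_criterion(coeffs):
    """a1 odd, a2 + a4 + ... even, a3 + a5 + ... even (stated for w >= 2)."""
    a = list(coeffs) + [0, 0]  # pad with zeros so a[1] always exists
    return (
        a[1] % 2 == 1
        and sum(a[2::2]) % 2 == 0
        and sum(a[3::2]) % 2 == 0
    )

def agree(w, degree, max_coeff):
    """Check that both tests agree for all small coefficient tuples."""
    return all(
        is_permutation(c, w) == rivest_criterion(c)
        for c in product(range(max_coeff), repeat=degree + 1)
    )
```

For example, x + 2x^2 passes the criterion and really does permute the residues modulo 8, while x + x^2 fails both tests.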
Katana - A Lightweight Templating Engine
I cobbled together some code one evening to replace Nunjucks as the templating engine powering the website you're reading right now. Sticking with the Japanese theme, I nicknamed it "Katana." It's only 177 lines of JS code: very lean, very fast, very flexible.
The parsing is done recursively "by feel." It segments a template document into blocks. To render, the code "implants" the document into the layout/parent which it inherits from, then reduces each block to the appropriate content.
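To illustrate the "implant, then reduce" idea, here is a heavily simplified Python sketch. It is not Katana's code: the Nunjucks-style {% block %} syntax, the regex, and the render function are all assumptions for illustration, and it handles only flat, non-nested blocks instead of parsing recursively.

```python
import re

# Hypothetical block syntax borrowed from Nunjucks: {% block name %}...{% endblock %}
BLOCK = re.compile(r"\{% block (\w+) %\}(.*?)\{% endblock %\}", re.S)

def render(child, layout):
    """Implant the child's blocks into the layout it inherits from."""
    overrides = dict(BLOCK.findall(child))

    def fill(match):
        name, default = match.group(1), match.group(2)
        # Reduce each block to the child's content, or keep the layout default
        return overrides.get(name, default)

    return BLOCK.sub(fill, layout)

layout = "<h1>{% block title %}Untitled{% endblock %}</h1><main>{% block body %}{% endblock %}</main>"
child = "{% block body %}Hello{% endblock %}"
page = render(child, layout)
```

Blocks the child does not define fall back to whatever the layout provides, which is the usual template-inheritance behavior.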
An interesting feature is that Katana permits complex JS expressions such as posts.slice(0,9) because it evaluates these in a separate Node.js vm context. The expression can't contain whitespace though, because the tokenization is stupid.
const vm = require('node:vm');
function evaluate(expr, context) {
  const c = vm.createContext(context);
  return vm.runInContext(expr, c);
}
Yes, I still use CJS for this website, because many components break whenever I attempt to shuffle everything to ESM.
IDA & Ghidra
I've been playing around with RE and binary exploitation recently. I found a cracked/leaked version of the IDA Pro 9.0 beta a while ago and have been experimenting with that. Nevertheless, I remain attached to Ghidra. I've also been attempting, not so successfully, to learn angr, an automatic concolic analysis tool. So confusing.