baal

Today I will be talking about the baal project, and what it entails.

The baal project can be found here. If you would like to contribute, please feel free to drop me an email containing your patch.

Now, you might be asking yourself: "what exactly is baal?" Allow me to answer your question. baal (always written lowercase) was originally meant to be an implementation of my good friend Tristan (deavmi) Kildaire's json based email protocol butterfly, but it slowly morphed into my own take on a json based email protocol; hence all credit for the idea should be directed towards deavmi.

Basics

Everything in baal is json based: config files, emails, etc. This keeps it simple and allows the config format to be easily extended without breaking anything else, i.e. we keep backwards compatibility.

Likewise, persistent storage is also done as json files (although I am tempted to use a database; if I do, I will use sqlite3). baal is not intended to have a large concurrent user base (max something like 32 users), as it is intended to be used almost on a user-by-user basis, that is, each user hosts their own server.

baal query

First, everything that happens in baal is in the form of a baal query, for example:

  • Sending an email
  • Syncing mailboxes
  • Creating mailboxes

A baal query is a json message, prefixed with a VLI of 4 bytes, with the json being of the form:

{
	"query" : "BAAL_QUERY",
	"data"  : <Any json object>
}

We have 2 fields to unpack, namely

  • query
    • What type of query this is, for example: register.
  • data
    • The data needed to fulfill this query, varies from query to query.
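
To make the framing concrete, here is a minimal sketch in C of how such a query could be put on the wire. It assumes the 4-byte VLI is simply the payload length in network byte order, and the helper name is illustrative, not part of baal itself:

#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Sketch: frame a json string as a baal query, prefixed with a
 * 4-byte length (assumed here to be in network byte order). */
int send_baal_query(int socket_fd, const char *json)
{
	uint32_t len = htonl((uint32_t) strlen(json)); /* 4-byte VLI prefix. */

	if (write(socket_fd, &len, sizeof len) != (ssize_t) sizeof len)
		return -1;
	if (write(socket_fd, json, strlen(json)) != (ssize_t) strlen(json))
		return -1;
	return 0;
}

A register query would then be sent as something like send_baal_query(fd, "{ \"query\": \"register\", \"data\": {} }");.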

More Queries

Since the documentation for baal is rather lacking, this blog post acts as the first official documentation for baal. In the next post (which will most likely be written tomorrow), I shall explain more of the currently implemented queries.

A slightly long tl;dr on baal

  • single binary (for both server and client).
    • That is, to run a server or client for the baal protocol, you just use a single (more specifically, the same) binary.
    • This is vastly simpler than setting up a traditional email server, and should allow users to easily set up their own baal mail servers.
  • open protocol (well it still needs to be written).
  • Multi user registration
  • Rich command set
    • Easily extensible: adding a new command consists of simply creating a starter function, which takes argc and argv (as a usual C program does), and execs the command.
  • Thread safe
    • A baal server can handle up to n (decided at compile time) concurrent connections, and all operations within these threads are (as of writing this) thread safe.
  • Server query hooks
    • The server can easily be extended, as the handling of incoming requests consists of passing the incoming packet to the appropriate hook; hence adding new features requires 2 simple changes (sketched after this list):
      • Adding a function which execs the hook
      • Adding a mapping of the command string in the request json to the hook function
    • Examples of hooks getting exec'd can be found here.
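
As a rough sketch of that hook mapping (the names and types here are my own illustration, not baal's actual code):

typedef int (*hook_fn)(int argc, char **argv);

struct hook {
	const char *query; /* Command string from the request json. */
	hook_fn exec;      /* Function which execs the hook. */
};

static int register_hook(int argc, char **argv)
{
	(void) argc;
	(void) argv;
	/* Handle a "register" query here. */
	return 0;
}

/* Adding a feature means adding one function and one entry here. */
static const struct hook hooks[] = {
	{ "register", register_hook },
};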

Actual tl;dr

baal = json + sockets = email server and/or client

~ Skiqqy

Migration

This post will be short, and will simply highlight the changes that occurred in the migration.

So I finally migrated from ghost to my own implementation of a blog. In its current form, my blog is a combination of 3 things, namely

  • bash
    • Handles post creation && assists in building the final blog. More specifically, this script.
  • mdbook
    • Handles the "heavy" lifting of creating the website from several markdown files
  • gwiki
    • Handles hosting and rebuilding if changes are pushed to the project repository

Thus my blog has the appearance of a static website, but when changes are pushed to the project repository they are detected, and the project is rebuilt and deployed, giving the feel of a "live" site.

This new blog also includes an rss feed.

Creating a Post

Creating a new post is rather simple; I do something like

$ ./blog post TITLE

This opens the new post in my $EDITOR, where I can then write my post. Then I simply

$ git add . && git commit -m 'New Post' && git push # Add latest post and push

This will then cause the gwiki instance to deploy the changes.

And that's that; everything else (post content) remains the same.

~ Skiqqy

Shell Scripting

So, since most of the shell scripts I write don't really deserve an entire post dedicated to them, I've decided to write one post detailing the interesting scripts I have written (and my love for shell scripting!). Most of these helper scripts can be found in my bin.

Although I do love POSIX, admittedly not all of my scripts are POSIX compliant. The main reason for this is that I really enjoy some of the features bash offers, such as

Just to name a few!

Other than what is used above, my scripts are POSIX compliant; hence, if I want to ensure I write a POSIX script, I basically just have to avoid using the above. With that being said, I have rarely actually needed to write a POSIX script, as I don't really need the portability it offers, and hence I have relished in the sins of using these bash features.

Note, new scripts get added as they are made.

Now let's get into the scripts!

pdf

This simple utility, humbly named pdf, aims to solve all the problems I faced when dealing with pdfs, namely compiling, reading, etc! Roughly speaking, the type of file given as the last argument to pdf dictates what it does. Below are examples.

$ pdf document.pdf # Open document.pdf in mupdf with a custom dark filter

$ pdf document.tex # Compile the LaTeX document to document.pdf

$ pdf man.1 # Compile the man page to man.pdf

$ pdf document.md # Compile the markdown document to document.pdf

$ pdf -t > document.tex # Setup the skeleton format of a LaTeX document

stream

This little utility of mine makes listening to or watching videos/livestreams (either from disk or the internet) easier, and comes with another utility, stream-helper, which makes controlling a stream instance easier. The basic use of stream is something like:

$ stream -fva # Launch stream without video and select video from dmenu

We can then do something like the below to pause the current video/audio:

$ echo pp | stream-helper -t

This allows stream to be controlled by other scripts/programs through the use of stream-helper. For example, pausing stream with a bind in dwm would simply be something like:

static Key keys[] = {
	...
	{ MODKEY, XK_p, spawn, SHCMD("echo pp | stream-helper -t") },
	...
};

Then when pressing mod + p, we pause the current stream session.

noted

Let's not forget the client for the noted project! This little script has undergone a few refactors, mainly just focusing on tidying up the functions etc. noted essentially interacts with the API provided by the noted server (aka noted-folder) to upload encrypted (planned, at least) todo notes, as well as sync them.

Features include

  • Syncing notes between clients and server
  • (Planned) client side encryption using PGP
  • GUI and TUI modes

pm

pm is technically some form of wrapper script, since 90% of the arguments passed to it are simply forwarded to pass. The part where it is not a wrapper script is when we call

$ pm backup

This is the feature I injected into my password manager, as I felt there wasn't really a sensible way to back up my passwords to other machines (either machines on my local network, or machines on a different network) that was still secure.

Then it hit me: the most sane way to do this was to simply reuse the features of pass, in combination with ssh. Since ssh provides a secure channel, I could simply decrypt my passwords and send them over ssh to the machine I am backing up to, where that machine sticks the passwords into pass again. Hence the backup feature was born!

The actual function which handles the backups is simple, at only ~42 SLOC; it can be found here. Particular attention should be paid to how I handle the transferring of the passwords (found here), as I transfer all passwords in a single ssh session, which is as optimal as I had hoped for.

If you are interested, pm is fully described in its README, and it is recommended you read it if you plan on using pm.

sgrep

Grep implemented (kinda) in bash with a combination of lex and gcc. Is it stupid? Yes. But don't stop here, go read the README.

theskiqqybot

Originally starting out as a telegram bot for my server (skiqqy.xyz), it is slowly but surely becoming my personal assistant bot (once I implement my plans for it).

In its current iteration it simply warns me when my server (and/or its sub-services) is down, alerting me that something is happening and that I should investigate.

ctagd

So, one of my favorite things to code is network-related programs. During my third year of university, one of my modules (aptly named "Networking") had us group up with one other person and develop several networking-related programs, a list of which can be found at the bottom of this page. This is what spawned my interest in socket programming, and that brings me to today's topic: ctagd.

This article will be focused on socket programming in C (i.e. using sys/socket.h, etc). One of the frustrating things for me personally is the complexity needed to set up basic socket communication. Look, I understand why the complexity exists; it is there because the communication has been abstracted to allow the library to achieve more. Let me illustrate some of the complexity with a few examples.

When you aim to do something simple (simple being just passing messages between sockets), at first glance the C code to accomplish this can seem very verbose. A good example of this is the complexity of this socket function for cmesg, and the passing/sending of messages here. Granted, the cmesg code base isn't exactly very clear, but this is mostly due to the fact that I was making the spec up as I went along.

First we shall highlight what the base goal of ctagd is (later on we shall expand upon it and add more features). We want it to be simple, that is, easy to use and easy to extend. Keeping it simple makes it easier to debug and increases performance (as we can focus more on tuning than debugging).

Trying to achieve the base goal of ctagd naturally brings me to my first point: the unneeded (for what we want to achieve) function parameters used when setting up a socket connection (for client and server). There are easily 4 parameters that can be refactored out to reduce complexity. Ideally, all we would like back when opening a socket is the socket file descriptor, which we use to identify and differentiate client connections from one another.

The easiest way to simplify this initial initialization process is to abstract all of these parameters into a single struct (one each for the server and client). This struct essentially acts as our server/client settings; hence, adding a new setting means we only have to update the appropriate struct and the init function itself. This means that when we extend ctagd, we won't break the init process of a server/client that is already in production (i.e. we keep backward compatibility). This is exactly what ctagd does, and it can be found here.
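
As a rough sketch of the pattern (the field and function names here are my own illustration, not ctagd's actual API):

struct server_settings {
	const char *host; /* Address to bind to. */
	int port;         /* Port to listen on. */
	int backlog;      /* Queue length passed to listen(2). */
};

/* Returns a listening socket fd, or -1 on error. Adding a new setting
 * only touches the struct and this function, so existing callers keep
 * working (backward compatibility). */
int server_init(const struct server_settings *cfg);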

Next we move on to message passing. Let's say we want to send 'Some String'; the "standard" C way is to do something like this:

#include <string.h>
#include <sys/socket.h>

/* Assume socket_fd is a valid socket file descriptor. */
char *s = "Some String";
int flag = 0;
send(socket_fd, s, strlen(s), flag);

Now when reading in the message the "standard" C way would be something like this:

#include <stdio.h>
#include <unistd.h>

/* Assume socket_fd is a valid socket file descriptor. */
char buff[129];
ssize_t n = read(socket_fd, buff, 128);

buff[n > 0 ? n : 0] = 0; /* Null terminate after the bytes actually read. */
printf("%s\n", buff); /* Print received message. */

The eager reader will have noticed a slight limitation/problem with the above segment of code: we can only read up to 128 bytes of data. In the case of us sending 'Some String' this is fine, but what happens if we were to send 256 bytes of data? The result would be that buff will contain the first 128 bytes of our message (with one byte spare for the null terminator), but there would still be 128 bytes of data left in the socket that we have to read. To get them we would then have to do something like:

/* Assume socket_fd is a valid socket file descriptor, and (for
 * simplicity) that each read returns a full 128 bytes. */
char buff1[129];
char buff2[129];
read(socket_fd, buff1, 128);
read(socket_fd, buff2, 128);

buff1[128] = 0; /* Null terminator. */
buff2[128] = 0; /* Null terminator. */
printf("%s%s\n", buff1, buff2); /* Print the full message. */

Now before I continue: yes, I know we could just increase the size of our buffer and the bytes read to 256 to solve the problem, but that would defeat the purpose of what I'm trying to illustrate, namely variable message length. In these situations we knew how long the messages were, and hence could allocate the buffers appropriately. But what if we want variable length messages and do not know their intended length? Hopefully this illustrates the problem to the reader. Normally, to solve this issue, we send control messages that are fixed in length before we send our actual message, but this requires a lot of tedious setup. Now let me explain how ctagd constructs its messages to deal with this issue.

The concept of a message in ctagd is a simple one. The theory behind it goes as follows: the first byte of our message is the tag (meaning we can have 256 unique tags), followed by 4 bytes that specify the length of the payload (we denote this field as len), and then lastly our payload, which is len bytes long. Hence it is of the form:

[byte:tag][4-bytes:len][len-bytes:payload]
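
To show how this format sidesteps the fixed-buffer problem, here is a minimal sketch of decoding such a message in C. The struct and function names are my own illustration (not ctagd's actual API), and I am assuming len is sent in network byte order:

#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

struct msg {
	uint8_t tag;   /* 1-byte tag. */
	uint32_t len;  /* Payload length in bytes. */
	char *payload; /* len bytes, null terminated for convenience. */
};

/* Read exactly n bytes, looping over short reads. */
static int read_full(int fd, void *buf, size_t n)
{
	size_t done = 0;

	while (done < n) {
		ssize_t r = read(fd, (char *) buf + done, n - done);
		if (r <= 0)
			return -1;
		done += (size_t) r;
	}
	return 0;
}

int msg_read(int fd, struct msg *m)
{
	uint32_t len;

	if (read_full(fd, &m->tag, 1) < 0 || read_full(fd, &len, 4) < 0)
		return -1;
	m->len = ntohl(len);
	m->payload = malloc((size_t) m->len + 1);
	if (!m->payload)
		return -1;
	if (read_full(fd, m->payload, m->len) < 0) {
		free(m->payload);
		return -1;
	}
	m->payload[m->len] = 0;
	return 0;
}

Since the header tells us exactly how many payload bytes to expect, we can allocate the buffer after reading it, and variable length messages stop being a problem.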

This brings me to the concept of an smsg; an smsg is simply shorthand for struct message, which is exactly how we store the message to make working with its data easier. An smsg makes sending and receiving variable length messages easier, for example:

Sending a message

  • Let's construct an smsg with tag 1 and message "Hello, World!"; we do this like so: create_smsg('1', "Hello, World!", smsg_pointer);
  • invoke csend(socket, smsg_pointer);, this will send the smsg to socket.

Receiving an smsg

  • If we have queues enabled (which will be described later), we simply invoke smsg_pointer = recv_tag(some_tag); which will set smsg_pointer to point to the first smsg in that tag's queue.
  • If we have queues disabled, we simply invoke cfetch(socket, smsg_pointer); which will set smsg_pointer to point to the smsg that was read from the socket.

As you can see, this makes sending and receiving variable length messages trivial, as cfetch/recv_tag and csend handle all the encoding and decoding of the smsg to and from a struct. For another example you can find a simple server using ctagd here and a simple client here.

Finally we move on to the recent feature that was added in v2.0: queues. The idea behind the queue is that each tagged message gets placed in its own queue based on its tag. We can then make a blocking recv based on tags. Both the server and client have queues and they "look" the same, but the implementations differ a little.

The server queue is the more complex of the two. For each client connected to the server, a thread is spawned (dubbed a handler for that client) that reads messages sent from that client and places them in the appropriate queue based on their tags. This process is thread safe, as mutexes are locked based on the tag. Hence two messages with different tags, received from different clients simultaneously, may be put in their queues at the exact same time without ambiguous behavior. But when we get two messages with the same tag from different clients simultaneously, the mutexes will lock and only allow one to be queued, then unlock to allow the next to be queued. When we want to remove an smsg from the queue, we likewise lock that queue (once again based on its tag), meaning no smsg (with the same tag) can be added to that queue, preventing ambiguous behavior.
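
A sketch of the per-tag locking scheme, reusing the illustrative struct msg from earlier (again, this is my own rendering of the idea, not ctagd's code):

#include <pthread.h>
#include <stdlib.h>

#define NTAGS 256 /* One queue and one lock per possible tag byte. */

struct qnode {
	struct msg *m;
	struct qnode *next;
};

static struct qnode *heads[NTAGS], *tails[NTAGS];
static pthread_mutex_t locks[NTAGS]; /* Each initialized with pthread_mutex_init() at startup. */

/* Called by a client's handler thread: enqueue m under its tag.
 * Messages with different tags never contend for the same lock;
 * a queue_get() (not shown) would pop from heads[tag] under it. */
void queue_put(struct msg *m)
{
	struct qnode *n = malloc(sizeof *n);

	if (!n)
		return; /* Allocation failure handling elided. */
	n->m = m;
	n->next = NULL;

	pthread_mutex_lock(&locks[m->tag]);
	if (tails[m->tag])
		tails[m->tag]->next = n;
	else
		heads[m->tag] = n;
	tails[m->tag] = n;
	pthread_mutex_unlock(&locks[m->tag]);
}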

In contrast, the client queue is far simpler. The client spawns a single thread which recv's smsg's from the server and places them into the appropriate queue (based on the smsg's tag). Here we still need locks, not for receiving multiple messages at once (this is not possible, since we are reading from a single socket), but for locking the queue to prevent an smsg from being removed whilst a new one is added, as this can result in ambiguous behavior (to the eager reader, this may in fact seem identical to the locks explained above).

Stay tuned for the next post that will do a deep dive on the implementation of ctagd.

~ Skiqqy

GnuSocial

Before I get into the meat of today's topic, first here is a link to the GnuSocial instance I am running inside a docker container. With that out of the way, let us get straight into the article.

I recently took up the task of creating a fully dockerized GnuSocial instance, effectively building a full LAMP stack inside a container. That is, the container contains:

  • L: Alpine Linux
  • A: Apache (Although I'm actually using lighty)
  • M: Mysql/MariaDB
  • P: PHP

Although the LAMP stack is baked into the container itself, I do plan on abstracting it out (which won't be hard at all) and making a generic LAMP container; this will allow me to easily host applications based on a LAMP stack.

The GnuSocial image repo can be found here; it includes building instructions, which are outside the scope of this post. It's not 100% complete yet, as the entry-point is not as refined as I would like it to be (I do 1 or 2 things that are kind of scuffed with regards to the DB). Also, there are many packages that are not needed, which easily increase the size of the image by ~150mb. Another annoying issue is the fact that ssl seems to not play well with GnuSocial, but I suspect that may be something on my end, as when connecting through https all css is lost and the page basically freaks out. Hence why it's currently only running on http.

The final issue (which should be fixed soon) occurs when creating the image: sometimes the database takes longer than 15 seconds to start up, in which case the GnuSocial user and the GnuSocial database are not created. If this does occur, then one must simply run,

$ docker exec -it <container name> sh
$ echo "CREATE DATABASE social;" | mariadb
$ echo "GRANT ALL on social.* TO 'social'@'localhost' IDENTIFIED BY 'social';" | mariadb

With the above done, you can actually begin the install. If you have the container running and have attached <port> to the internal port 4000 of the docker container, then one must simply go to http://localhost:<port>/install.sh and follow the instructions from there. Once completed, you can access your GnuSocial instance from http://localhost:<port>/.
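
For example, attaching host port 8080 to the container's internal port 4000 would look something like this (the image name is just a stand-in for whatever you tagged the image as):

$ docker run -d --name gnusocial -p 8080:4000 <image name>

After which the installer would be reachable at http://localhost:8080/install.sh.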

This was my second time creating my own Docker image, the first being when I created an image for the cmesg server, which can be found here. These first two images were my first direct attempts at creating docker images (although I have a lot of experience using other people's images), hence there may be one or two mistakes, or I may have done something in a rather obtuse manner. I do plan on tidying up these images to remove any bloat.

nginxd

I have just added a really nice script that has made maintaining my website infinitely easier. What follows is a brief explanation of the role this script plays in maintaining this website, as well as the design decisions made during its conception.

First let me explain what exactly nginxd does; the idea is actually pretty simple, trivial if you will. In essence, nginxd allows me to disable (as well as enable) certain sections of my website. This allows me to gracefully control what is shown to the public, and what is only available on my local network. Before I get into more details, let me first define a few terms,

  • unit: An nginx config that maps a domain to another domain:port.
  • dunit: Essentially the inverse of a unit, except it maps the domain to a specified error page.
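
To make these terms concrete, here is a sketch of what a unit and its dunit could look like as nginx configs (the service, port, and paths are made up for illustration; these are not nginxd's actual files):

# git unit: map the domain to a local service port.
server {
	listen 80;
	server_name git.skiqqy.xyz;

	location / {
		proxy_pass http://localhost:3000;
	}
}

# git dunit: the inverse, mapping the domain to a "down" page.
server {
	listen 80;
	server_name git.skiqqy.xyz;

	error_page 503 /down.html;

	location = /down.html {
		root /var/www; # Hypothetical path holding the down page.
		internal;
	}

	location / {
		return 503;
	}
}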

The design decisions made during the writing of the script basically boil down to the concept of units and their inverses, dunits. When nginxd is run, it is normally given a unit on which to act (not a dunit), and each unit has an implicit dunit (kind of like how in the field ℝ, for each nonzero r∈ℝ there exists an r' such that r*r'=1). Hence working with the unit is fine, as we can easily deduce the associated dunit. The script was also written in a very modular fashion, allowing functions to be reused multiple times over, which allowed for easy debugging.

So, with the above in mind, what nginxd does is this: when we want to enable a service (e.g. git), we enable that service's unit whilst removing the associated dunit. This exposes the service to the internet, and users can interact with it. When we wish to disable a service, the inverse happens; that is, we enable that service's dunit whilst removing the associated unit, so when we try to reach the service from the internet, we are served with an appropriate page stating that the service is currently down.

With the above explained, what follows is a short example of how nginxd is used.

Enable <unit>:

$ nginxd -e <unit>

Disable <unit>:

$ nginxd -d <unit>

By default, if run without any arguments, nginxd will enable all available units; this is the same as running:

$ nginxd -e all

Disable all units:

$ nginxd -d all

For help, simply run:

$ nginxd -h

nginxd comes with the following units (and each has its own dunit),

  • git
  • irc
  • wiki
  • blog
  • proj

All in all, I must say this is a very useful script to have added to my toolbox, making the maintenance of skiqqy.xyz a lot easier. Ironically, just when I finished the script I made a post about it on mastodon, and after the post I accidentally broke the container running this blog, so I had to disable the blog for a bit to allow me to fix it. Once fixed, I simply re-enabled the blog unit and my blog was up again. During the downtime a few people actually got served the down page, which was the intended outcome.

That about wraps up this post; I might delve a bit further into my plans for nginxd another time.

~ Skiqqy

New vs Old

Once again, I think the title should be unpacked. In this post I shall explain my new setup for hosting my website and how it differs from the old model. This won't be that long of a post, as I mostly just want to get to the point.

I suppose we should just jump right in! Currently my server is running on a single raspberry pi, with most subdomains being routed through nginx to a specific docker container offering that service. I plan on purchasing the newest raspberry pi very soon, as I want to start building a cluster using K8s, but this will most likely only happen later in the year (likely in December).

The biggest change from my previous setup is that I didn't actually host the root of my website before, which really was unfortunate, as it meant updating the website was exceedingly tedious: I could not ssh into the remote machine (the previous host did not offer this...), and hence any changes forced me to use the web application in order to update my website. However! With my new setup this is no longer an issue, as I can ssh to my pi whenever I need to, and I have also set up several crontabs to ensure that my domain will always point to my pi's public IP address.

One of the other issues with my previous setup was the fact that the ports were really going wild. The reason I say this is that, since I did not use a reverse proxy, I was forced to encode the port into the url (i.e. http://git.skiqqy.xyz was something like http://git.yggpi.co.za:8834), which needless to say was overcomplicated and could easily result in erroneous behavior.

As for the source code for my website, it is hosted on sourcehut, and mirrored on my gitea instance. I have my pi pull master every 5 minutes (using a crontab) to ensure the website stays up to date. Another important feature I added was a script which automatically updates my DNS records to point to my pi, hence ensuring maximal uptime.
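
As a sketch, the crontab entries behind this look something like the following (the paths and script names here are made up for illustration):

# Pull the latest website sources every 5 minutes.
*/5 * * * * cd /home/pi/website && git pull --quiet

# Re-point DNS records at this pi's public IP on boot.
@reboot /home/pi/bin/update-dns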

The awesome thing about everything mentioned above is that, together, these measures ensure that as long as my pi is powered and connected to the internet, my website (and all its services) should be up and running (provided I haven't manually disabled something).

~ Skiqqy

First Post

First, I suppose I should explain the title; that seems fair.

So it all started with my very first blog post (some time earlier this year), found here. Yes, it is primitive; I had great plans for it (and still do!), but alas, university claimed a lot of my time this year.

I recently purchased the domain skiqqy.xyz, and have been putting it to use! Since my previous registrar did not provide access to an API, updating the DNS settings of my previous domain, yggpi, proved rather frustrating. Luckily, my new registrar, godaddy, has a developer API that has allowed me to automate the job of keeping my DNS records pointing to the correct devices. The script that does this can be found here; the nice thing about it is that my domain will always point to my raspberry pi, which hosts all parts of my website (as I set up a crontab to ensure this script gets run on boot).

I am very happy with the progress of my website so far. The setup is vastly superior to my previous one (and way simpler); what follows is a brief explanation of the current setup, which I will dive into in further detail in another post.

The "core" of my website runs on nginx, which I use as a reverse proxy to map sub-domains to the correct "handler". For the static "root" part of my website is run using lighttpd which is nice and fast. Just about everything else runs inside a docker container (git, irc web chat, this blog, project section), which I shall go into more detail in a later blog post!

I feel that this wraps up my first blog post. I plan to have daily (or at least weekly) posts on here detailing everything that I am up to :D

~ Skiqqy