I hope I hate this code one day
I remember the first program I built fully on my own: a music downloader. There was a site that would basically index random
music files found on Google and present you with a way to search them. I guess they would periodically search google
for things like intitle:index.of mp3
(which you can still do)
and store links to the files they found. My program would let you type in a query into your console, search for that
query in the website and allow you to select a file to download.
The code (and the site) is now long gone but the memory of how I parsed the HTML lives on in my head:
files = []
for line in page_html.split('\n'):
if line.startswith('<a class="download'):
url = line.split(' ')[3].replace('href="', "").replace('"', '')
files.append(url)
With the experience I’ve gained since I wrote that code I know how much is wrong with this. What if the HTML is
minified and doesn’t contain newlines? What if the attribute orderings change? What if the download
class changes?
And don’t even think about using regex to parse it.
But at the time I didn’t know how to parse HTML and I didn’t care. I think there’s a certain liberty to that which you lose as you grow as a developer and this hampers your ability to pick up a new language.
I found this was hampering me while learning Rust. I’m in the middle of writing my first Rust project which involves
launching a number of individual subprocesses and displaying their output interleaved in the console. Rust is a powerful
language with a lot of interesting features and I wanted to try and write idiomatic code from the get-go. So for
example I started to try and make use of traits and lifetimes to pass references and avoiding cloning, but I
found I was spending all my time on this and it was getting in the way of my ability to make something that worked.
Sure, passing a reference to a String
that has a correct lifetime is more efficient and likely prettier than
string.clone()
, but while learning I don’t think you should get stuck on things like this.
Get it working, then learn how it should work.
Back to my project: I’m taking the stdout from a number of processes spawned via a rayon parallel iterator. I want to display the current state of each thread in the console and control how they are displayed. For this I went with a standard mpsc channel with multiple senders and one receiver:
fn monitor_threads(rx: Receiver<(ThreadId, String)>) {
let states: Vec<(ThreadId, String)> = vec![];
for (thread_id, msg) in rx.iter() {
// Do we have this state in our states?
if states.iter().any(|(t, _)| t == &thread_id) {
// Update the tuple with the new message
stack.iter_mut().filter(|t| t.0 == thread_id).for_each(
|t| t.1 = msg.clone()
);
} else {
// Push the job state to the vector
stack.push((thread_id, msg.clone()));
}
// Display the states, and erase the previous output
//using ansii escape sequences.
print_thread_states(states);
}
}
I tried to use a HashMap
but that didn’t work for a reason I cannot remember. I’m certain a Vec
is not the best structure here, and I doubt this is the most efficient way of updating it, but it works.
To send updates to it:
fn run_stuff(jobs: Vec<Job>) {
let (tx, rx) = channel();
let monitor_thread = thread::spawn(move || {
monitor_threads(rx);
});
jobs.par_iter().for_each(|j| {
let process = j.spawn_process();
for line in process.lines() {
// Truncate the line to 70 characters and send it to the monitor.
let line = line.trim();
if line.len() >= 70 {
line.truncate(70);
line.push_str("...");
}
tx.send((thread::current().id(), line.clone()));
}
});
monitor_thread.join();
}
I have a feeling that this isn’t great. I’m sure there is a better way to do all of this, like some fantastic way of
avoid the need to clone the String
’s being sent to the channel, or to avoid the inefficiencies around truncating the
output. Maybe there is even an antigravity module that I can use to just avoid this whole mess
entirely.
But, it doesn’t matter. It’s more fun and I’m learning more by not getting lost in the weeds of how I think, from my other experiences, that this should be done and just doing it. And I hope one day I know enough about Rust to look back at this code and hate it in the same way that I did with my music downloader.
I think that’s something we should all aspire to when we are learning a new language.