"There is only one heroism in the world: to see the world as it is, and to love it." - Romain Rolland

Async in Traits Just Save Us

Morris published on 2024-03-10

The Challenge of Implementing the Future Trait for Custom Types

Developers who venture into crafting their own async/await implementations in Rust may encounter the intricate task of implementing the Future trait for their custom types. Rust’s approach to async/await is nuanced, offering a stark contrast to languages like Go, which employs preemptive scheduling. Instead, Rust embraces lazy evaluation and cooperative scheduling, allowing developers to control exactly where a task yields to the executor. This level of control, however, introduces complexity in implementing the Future trait for custom types. The complexity arises because .await can’t be used inside a non-async function such as Future::poll, so these custom types need a hand-written state machine (or something similar). Writing one is laborious, error-prone, and hard to maintain, and it may push developers toward BoxFuture<T>, a choice that can cost performance.
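To make the state-machine burden concrete, here is a minimal sketch of a hand-written Future using only the standard library. The Countdown type, its states, and the block_on helper are hypothetical names invented for this illustration; the no-op waker turns block_on into a busy-poll loop, which is fine for a demo but not how a real executor behaves.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future: yields to the executor `remaining` times, then completes.
/// Each enum variant is one state of the hand-written state machine.
enum Countdown {
    Running { remaining: u32 },
    Done,
}

impl Future for Countdown {
    type Output = &'static str;

    // `poll` is not an async fn, so `.await` is unavailable here;
    // every yield point has to be encoded as an explicit state transition.
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `Countdown` is `Unpin`, so a plain mutable reference is allowed.
        let this = self.get_mut();
        match *this {
            Countdown::Running { remaining: 0 } => {
                *this = Countdown::Done;
                Poll::Ready("done")
            }
            Countdown::Running { remaining } => {
                *this = Countdown::Running { remaining: remaining - 1 };
                // Ask to be polled again, then yield to the executor.
                cx.waker().wake_by_ref();
                Poll::Pending
            }
            Countdown::Done => panic!("Countdown polled after completion"),
        }
    }
}

/// Waker that does nothing; enough for the busy-polling helper below.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical toy executor: polls in a loop until the future is ready
/// and reports how many polls that took.
fn block_on<F: Future>(fut: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
    }
}
```

Under these assumptions, block_on(Countdown::Running { remaining: 3 }) returns ("done", 4): three polls that yield Pending, followed by one that returns Ready. Even this trivial future needs an explicit enum, a transition per state, and a guard against polling after completion, which is the maintenance cost described above.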

Rust Profiling Essentials with perf

Morris published on 2023-10-09

What is profiling?

A: Sampling the program at specific points in time, and doing some statistical analysis on the samples.

It can be one of the following:

Reading backtraces of the program on every 1000th cycle, and representing them in a flamegraph.
Reading backtraces of the program on every 1000th cache miss, and representing them in a flamegraph.
Reading backtraces of the program on every 10th memory allocation, and representing them in a flamegraph.
Getting the return address of the program on every 10th memory allocation, and showing counts for every line.
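The common pattern in the list above is "count events, and take a sample on every Nth one". A minimal software sketch of that idea follows. The Sampler type and its record method are hypothetical names for illustration only; a real profiler such as perf counts events in hardware or in the kernel and captures a backtrace rather than a hand-supplied site label.

```rust
use std::collections::HashMap;

/// Hypothetical software stand-in for an event counter with a sampling period.
struct Sampler {
    /// Take one sample every `period` events (e.g. 10 for "every 10th allocation").
    period: u64,
    /// Total number of events seen so far.
    events: u64,
    /// Number of samples attributed to each site (a stand-in for a source line).
    counts: HashMap<&'static str, u64>,
}

impl Sampler {
    fn new(period: u64) -> Self {
        assert!(period > 0, "period must be non-zero");
        Sampler { period, events: 0, counts: HashMap::new() }
    }

    /// Record one event that happened at `site`.
    /// Returns true when this event triggered a sample.
    fn record(&mut self, site: &'static str) -> bool {
        self.events += 1;
        if self.events % self.period == 0 {
            *self.counts.entry(site).or_insert(0) += 1;
            true
        } else {
            false
        }
    }
}
```

With a period of 10, a site that produces 30 events followed by a site that produces 10 ends up with 3 and 1 samples respectively, so the sample counts approximate each site's share of the events at a fraction of the recording cost.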

How to trigger a sample?

Kinds of triggers:

Serverless with Rust and Protocol Buffers

Morris published on 2023-03-08

I’ve recently been working on rewriting our small service in Rust for fun. One reason is to see if Rust, as a systems programming language, is ready for cloud development in 2023. Another reason is that I wonder how much better it is than Python or Java for mainstream cloud and data computing workloads.

💡 There is a lot of discussion over which language to use for a serverless service. In my view, dynamic languages like Python lack compile-time checks, which can lead to more runtime errors than in static languages like Rust or Java.
Another reason is that Python and Java need a runtime process (CPython, JVM) to run your actual code, which means they can’t run natively like compiled Rust does. However, I’m not sure how large the performance gap is. Furthermore, Rust doesn’t require garbage collection, so ideally a program’s heap size should fluctuate less than that of a garbage-collected program while running. That means the infrastructure controller underneath a Lambda or Function should be able to handle invocation and scaling more easily.
In terms of language paradigms, in my opinion, the modern ML (meta-language) family has a better design for scalable cloud services than traditional OOP.

Make Kafka Schema Easier

Morris published on 2022-01-26

TL;DR: You can find the script on my GitHub repository: kschema-table

Coordinating Schemas with Data Scientists

As a data engineer, you’ll need to collaborate closely with data scientists and domain experts to design data schemas. The optimal schema will depend heavily on the business domain and how the product is used. For instance, in a cybersecurity context, threat experts and data scientists would likely be the ones designing the data schema. They may work with infrastructure and data teams to define these schemas in a common format such as Avro IDL (Avro Interface Description Language).

SSH Tunneling Summary

Morris published on 2020-12-10

There are three types of SSH port forwarding:

Local Port Forwarding: Forwards a connection from the client host to the SSH server host and then to the destination host port.
Remote Port Forwarding: Forwards a port from the server host to the client host and then to the destination host port.
Dynamic Port Forwarding: Creates a SOCKS proxy server that allows communication across a range of ports.

Roles

Client: Any machine on which you run ssh to set up Port Forwarding.
SSH Server: A machine that the Client can SSH into.
Target Server: The machine you want to establish a connection to, usually to expose a service running on it to the outside world.

Note: either the Client or the SSH Server can itself be the Target Server, so Port Forwarding doesn’t really need three machines! With Dynamic Port Forwarding, however, there is no single fixed Target Server; the target is determined dynamically for each connection.
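As a rough sketch of how the three forwarding types map onto the ssh command line, the snippet below builds the argument list for each one. The -L, -R, and -D options are the standard OpenSSH flags; the Forward enum, its field names, the ssh_command helper, and the host names in the usage note are hypothetical and exist only for this illustration.

```rust
/// The three kinds of SSH port forwarding (type and field names are illustrative).
enum Forward {
    /// -L: listen on the Client; the SSH Server connects to the Target Server.
    Local { listen_port: u16, target_host: String, target_port: u16 },
    /// -R: listen on the SSH Server; the Client connects to the Target Server.
    Remote { listen_port: u16, target_host: String, target_port: u16 },
    /// -D: SOCKS proxy on the Client; the target is chosen per connection.
    Dynamic { listen_port: u16 },
}

impl Forward {
    /// Render the forwarding as the corresponding ssh option and its value.
    fn to_args(&self) -> Vec<String> {
        match self {
            Forward::Local { listen_port, target_host, target_port } => vec![
                "-L".to_string(),
                format!("{}:{}:{}", listen_port, target_host, target_port),
            ],
            Forward::Remote { listen_port, target_host, target_port } => vec![
                "-R".to_string(),
                format!("{}:{}:{}", listen_port, target_host, target_port),
            ],
            Forward::Dynamic { listen_port } => {
                vec!["-D".to_string(), listen_port.to_string()]
            }
        }
    }
}

/// Build the full command line: ssh <forwarding option> <SSH Server>.
fn ssh_command(forward: &Forward, ssh_server: &str) -> Vec<String> {
    let mut cmd = vec!["ssh".to_string()];
    cmd.extend(forward.to_args());
    cmd.push(ssh_server.to_string());
    cmd
}
```

For example, a Local forward with listen_port 8080, target_host "localhost", and target_port 80 against the placeholder server "user@ssh-server" yields ssh -L 8080:localhost:80 user@ssh-server. Here the Target Server is the SSH Server itself, which is the two-machine case mentioned in the note above. The Dynamic variant carries no target at all, matching the fact that its target is decided per connection.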