Why Deep Learning Sucks

After some years of studying and using deep learning, I have constantly struggled with debugging errors and setting hyperparameters. For a researcher, this wastes not only time but also money and resources. In this article, we will demonstrate how traditional rule-based methods have a hidden edge (besides simplicity) in solving complex problems that require automation.

The self-driving car problem:

Let's assume we want to solve the self-driving car problem, where the car needs to navigate safely to its destination while avoiding collisions with other objects.

The way of thinking (with deep learning):

A deep learning engineer will start by looking for sub-problems that have already been solved using state-of-the-art networks. For example, they may look directly for object detection models like YOLO and then planning modules like ChauffeurNet. In this case, the task is practically solved; however, since we don't care about what's going on inside these models, we are tempted to just pass in the live camera feed and skip important pre-processing steps (e.g., image enhancement or simple detection of road lanes).

Additionally, deep learning engineers may not consider filtering out noise or taking detection uncertainty into account when planning the route. This can lead to inaccurate predictions and potential safety hazards.

The way of thinking (without deep learning):

On the other hand, an engineer working without deep learning will start by analyzing deeply how to make the problem solvable through simple logic rules. For example, all we really need from detection is to keep the car within the road. We can achieve this by treating the road as an easily distinguishable element, using image processing steps such as detecting line patterns and the asphalt color. We can also account for other traffic entities on the road by employing background subtraction techniques.
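
To make this concrete, here is a rough sketch of such a rule-based perception step using standard OpenCV building blocks (Canny edges, a probabilistic Hough line transform, and a MOG2 background subtractor). All thresholds are hypothetical placeholders that would need calibration for a real camera:

import cv2
import numpy as np

# Background subtractor for spotting other traffic entities (parameters are illustrative).
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

def detect_road_and_traffic(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)

    # Lane markings: edge detection followed by a probabilistic Hough line transform.
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                            threshold=50, minLineLength=40, maxLineGap=20)

    # Road surface: a simple gray/asphalt color mask in HSV (low saturation, mid value).
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    road_mask = cv2.inRange(hsv, (0, 0, 40), (180, 60, 180))

    # Other traffic entities: foreground blobs from background subtraction.
    fg_mask = subtractor.apply(frame_bgr)
    contours, _ = cv2.findContours(fg_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    obstacles = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 500]

    return lines, road_mask, obstacles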

For planning, a general algorithm like the social force model can be used to find the shortest path to the destination while avoiding obstacles. This solution may seem less reliable at first glance; however, when calibrated and tested well, its performance-to-investment ratio can be surprisingly high.
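
For intuition, here is a minimal sketch of the social-force idea: an attractive force pulls the vehicle toward its goal, while exponentially decaying repulsive forces push it away from obstacles. The gains and ranges below are illustrative, not calibrated values:

import numpy as np

def social_force_step(pos, vel, goal, obstacles, dt=0.1,
                      k_goal=1.0, k_obs=2.0, obs_range=2.0):
    """One integration step of a simplified social force model.
    All gains (k_goal, k_obs, obs_range) are illustrative, not calibrated."""
    # Attractive force: unit vector toward the goal, scaled by a gain.
    to_goal = goal - pos
    force = k_goal * to_goal / (np.linalg.norm(to_goal) + 1e-9)

    # Repulsive forces from each obstacle, decaying exponentially with distance.
    for obs in obstacles:
        away = pos - obs
        dist = np.linalg.norm(away) + 1e-9
        force += k_obs * np.exp(-dist / obs_range) * (away / dist)

    # Simple Euler integration of the resulting acceleration.
    vel = vel + dt * force
    pos = pos + dt * vel
    return pos, vel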

Real-world applications of traditional rule-based methods:

Traditional rule-based methods have been successful in a variety of real-world applications where deep learning approaches may not perform as well or are less suitable. Some examples include:

  1. Fraud detection systems that analyze transaction data and identify patterns to detect fraudulent activities, such as credit card transactions with unusual spending habits (a minimal rule sketch follows this list).
  2. Spam filtering systems used by email services that use rules-based algorithms to identify and filter out unwanted emails based on factors like sender reputation or message content.
  3. Speech recognition systems in call centers where traditional rule-based methods can accurately recognize spoken words, even when there is background noise or accents present.
  4. Robotics applications that require precise control over movements and actions of robots to perform tasks such as picking up objects from a table or assembling components for manufacturing processes.
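
To make the first example concrete, a rule-based fraud check can be as simple as comparing each transaction against a customer's recent spending statistics. The thresholds below are purely illustrative:

from statistics import mean, stdev

def is_suspicious(amount, recent_amounts, z_threshold=3.0, hard_limit=5000.0):
    """Flag a transaction if it exceeds a hard limit or deviates strongly
    from the customer's recent spending. Thresholds are illustrative only."""
    if amount > hard_limit:
        return True
    if len(recent_amounts) >= 5:
        mu = mean(recent_amounts)
        sigma = stdev(recent_amounts) or 1.0
        return (amount - mu) / sigma > z_threshold
    return False

# Example: a 4,800 purchase after a history of ~50-dollar transactions is flagged.
print(is_suspicious(4800.0, [40.0, 55.0, 48.0, 60.0, 52.0]))  # True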

Limitations of deep learning:

While deep learning has revolutionized many aspects of artificial intelligence, it is not without its limitations. Some challenges associated with deep learning include:

  1. Computational complexity - Deep neural networks can be computationally expensive to train and require large amounts of data for effective modeling.
  2. Data scarcity - In some cases, there may simply not be enough training data available to effectively train a deep learning model, leading to suboptimal performance or overfitting issues.
  3. Lack of interpretability - Deep neural networks can be difficult to understand and interpret, making it challenging for humans to explain how the model arrived at its predictions or decisions.
  4. Difficulty in debugging - Debugging errors in deep learning models can be time-consuming and require specialized skills, such as identifying issues with specific layers of a neural network or optimizing hyperparameters.
  5. Ethical concerns - Deep learning systems may have unintended consequences that could lead to biases or discrimination against certain groups if not properly designed and tested for fairness.

Conclusion:

In this post, we just want to highlight the advantages of a workflow without deep learning. Deep learning makes us lazy, because we learn to just "smash" all the inputs together unprocessed and let the magic happen. The first danger is developing this mentality; an even bigger danger is letting such a not-fully-understood system run independently.

The optimal way is to 'deepen' your own learning first, and only then involve some neural networks for the truly high-order non-linear relationships in your model.

The unexpected winter of Artificial Intelligence

Nowadays, everyone is excited about the latest trends in AI applications, like ChatGPT, self-driving cars, image synthesis, etc. This overhype is not new; it happened before, in the AI winters of the 1970s and late 1980s. Some warn against it, because it may cause disappointment and even a new AI winter. But here I will talk about a bottleneck of AI research that I have come across in my own work. It may not be called a winter, but it will definitely cause a slowdown in the field.

What is wrong? Too many models

When a newcomer wants to review the state of the art in a machine learning field, like object detection, video understanding, or image synthesis, they will find a lot of models. One approach is to pick just the most notable work of the current year, but this means missing a lot of other work that may be more suitable for their problem. Another approach is to review all the models, but that is a lot of work, and it's not clear how to compare them.

In the voice of this poor overwhelmed researcher, I would say: "Please, enough with the models; give us more coherent theories that could add up!"

Research Question

When you look at the research questions of the majority of these papers, you can see that they follow a 'template', like:

  • How to improve the performance of the model?
  • How to improve the speed of the model?
  • How to improve the accuracy of the model?

but questions like that should be stated more like:

  • How to solve the performance problems of the last state of the art model?

Because it's not enough to show better numeric values as an 'answer'. The answer should spell out when and why the improvement happens. This is definitely more helpful than just saying that the model outperforms the previous state of the art.

Research Importance

It's well known that the deep learning field is experimental. But if the whole contribution is training and evaluating a model, then this is like saying "I proved it under condition X; every other condition is not tested yet." This actually helps no one. The contribution should be more than just a technical improvement: it should be a new idea that can be used in other fields, or a new way to solve a problem.

What to do?

Less is more

I think this can start with every researcher: building a model should be based on robust mathematical foundations. This would not only shift the field from quantity toward quality, but also make it easier to review, because a newcomer could easily fit every new work into the whole picture and know exactly what is needed in every situation.

Math-only models

In fact, new models can be proven without training. It's a brave claim, I know, but I can give some examples, like the Kalman filter paper [1], where the Kalman filter is proven to be optimal in the linear case purely mathematically. Another example is "Visual SLAM: why filter?" [2], where the superiority of bundle adjustment over Kalman filtering is argued in part mathematically.
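
As a reminder of what such a training-free result looks like, the linear-Gaussian model assumed in [1] and the resulting filter can be written compactly in standard textbook notation; under these assumptions the filter is provably the minimum-variance linear estimator, with no data-driven training involved:

x_k = A x_{k-1} + w_k, \quad w_k \sim \mathcal{N}(0, Q)
z_k = H x_k + v_k, \quad v_k \sim \mathcal{N}(0, R)

\hat{x}_{k|k-1} = A \hat{x}_{k-1|k-1}, \qquad P_{k|k-1} = A P_{k-1|k-1} A^{\top} + Q
K_k = P_{k|k-1} H^{\top} (H P_{k|k-1} H^{\top} + R)^{-1}
\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k (z_k - H \hat{x}_{k|k-1}), \qquad P_{k|k} = (I - K_k H) P_{k|k-1}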

If you have good knowledge of the robotics field, you will know that these papers are really famous. One might say that not everyone can do that, but why not at least try? Or just make it clear that the model is not proven and is only a good guess, while also making the effort to explore the theory behind it.

Lastly, GPUs for all

I think that the main reason for the overhype is the availability of GPUs. It's no secret that GPUs are the main driver of the recent AI boom, but it's also no secret that strong GPUs are not available to everyone.

Therefore, those who have the best hardware will produce the better models. This deviates research from its goals. One solution could be to provide GPU access for all researchers, as Google is doing [3]. But this could be done in a broader way that guarantees the elimination of hardware hindrance among researchers.

If, one day, the whole research community used one global system, this would also make reproducing work easier.

Note:

This article is my personal opinion based on my limited experience in the field. As I'm still learning, I'm open to any feedback and discussion.

References

[1] https://asmedigitalcollection.asme.org/fluidsengineering/article-abstract/82/1/35/397706/A-New-Approach-to-Linear-Filtering-and-Prediction?redirectedFrom=fulltext

[2] https://www.sciencedirect.com/science/article/abs/pii/S0262885612000248

[3] https://edu.google.com/programs/credits/research/?modal_active=none

Train your deep neural network faster with Automatic Mixed Precision

Have you been working on a large deep learning model and wondered how to squeeze out every possible saving in training time? Or maybe you have the best GPU hardware but still find the speed too slow. Well, look at the bright side: this means you still have room for improvement :)

One option for speeding up deep learning training has always been stacking more digital circuits into optimized hardware devices like GPUs or TPUs. Here, however, we show an additional option: adaptively changing the numerical precision to save computation time while keeping the same accuracy.

The idea is simple: use FP32 only where it is needed, for example for small gradients, and use FP16 everywhere it is sufficient.
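
To see why small gradients are the tricky part, note that FP16 simply cannot represent very small magnitudes. The quick check below (not part of any training code) shows a value that survives in FP32 but flushes to zero in FP16:

import torch

small = 1e-8  # think of it as a tiny gradient value

print(torch.tensor(small, dtype=torch.float32))  # tensor(1.0000e-08)
print(torch.tensor(small, dtype=torch.float16))  # tensor(0., dtype=torch.float16) -- underflow
print(torch.finfo(torch.float16).tiny)           # smallest normal FP16 number, about 6.1e-05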

Needed Hardware

You may get some speedup on any hardware; however, on NVIDIA GPUs with Tensor Cores (Ampere, Volta, or Turing architectures) the speedup can reach about 3x at best.

To find out which device you have, just issue the command nvidia-smi
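
Alternatively, if PyTorch is already installed, you can query the device from Python. Tensor Cores, which mixed precision relies on, are available from compute capability 7.0 upwards (Volta, Turing, Ampere):

import torch

if torch.cuda.is_available():
    name = torch.cuda.get_device_name(0)
    major, minor = torch.cuda.get_device_capability(0)
    print(f"{name}: compute capability {major}.{minor}")
    # Tensor Cores are available from compute capability 7.0 onwards.
    print("Tensor Cores available:", major >= 7)
else:
    print("No CUDA device found")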

Needed Software

Most popular deep learning frameworks support this feature, including TensorFlow, PyTorch, and MXNet. As a showcase, an example of a network in PyTorch is provided below.

Example

First we need to define the network model:

import torch
from torch import nn


class Model(nn.Module):

    def __init__(self, layer_1=16, layer_2=16):
        super().__init__()

        # Three fully connected layers: 8 inputs -> layer_1 -> layer_2 -> 1 output
        self.fc1 = nn.Linear(8, layer_1)
        self.fc2 = nn.Linear(layer_1, layer_2)
        self.fc3 = nn.Linear(layer_2, 1)

    def forward(self, x):
        # ReLU activations on the hidden layers, linear output for regression
        x = nn.functional.relu(self.fc1(x))
        x = nn.functional.relu(self.fc2(x))
        return self.fc3(x)

Before running the training program, we initialize some dummy inputs and targets and move the model to the GPU:

batch_size = 512
data = [torch.randn(batch_size, 8, device="cuda") for _ in range(50)]
targets = [torch.randn(batch_size, 1, device="cuda") for _ in range(50)]

loss_fn = torch.nn.MSELoss().cuda()

net = Model().cuda()  # move the model to the GPU to match the data

Now, the training program is run normally as follows (using FP32 precision):

opt = torch.optim.SGD(net.parameters(), lr=0.001)

epochs = 1
for epoch in range(epochs):
    for input, target in zip(data, targets):
        output = net(input)
        loss = loss_fn(output, target)
        loss.backward()
        opt.step()
        opt.zero_grad() 
        # set_to_none=True here can modestly improve performance

If we want to use automatic mixed precision, we wrap the forward pass in an autocast context and use a gradient scaler. Autocast chooses FP16 or FP32 per operation as needed, while the scaler scales the loss so that small FP16 gradients do not underflow.

use_amp = True

opt = torch.optim.SGD(net.parameters(), lr=0.001)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

for epoch in range(epochs):
    for input, target in zip(data, targets):
        with torch.autocast(device_type='cuda', dtype=torch.float16, enabled=use_amp):
            output = net(input)
            loss = loss_fn(output, target)
        scaler.scale(loss).backward() #instead of loss.backward
        scaler.step(opt) # instead of opt.step()
        scaler.update() # to prepare for next step
        opt.zero_grad() 
        # set_to_none=True here can modestly improve performance

To check the speedup, you can measure the runtime difference between the last two blocks.
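
A rough way to do that (timing whole loops rather than individual kernels) is to synchronize the GPU and wrap each loop in a wall-clock timer. The names train_fp32 and train_amp below are hypothetical placeholders for the two loops above:

import time
import torch

def time_loop(train_fn):
    """Run a training loop once and return its wall-clock duration in seconds."""
    torch.cuda.synchronize()           # finish any pending GPU work first
    start = time.perf_counter()
    train_fn()                         # e.g. the FP32 loop or the AMP loop above
    torch.cuda.synchronize()           # wait for all launched kernels to complete
    return time.perf_counter() - start

# Hypothetical usage, after wrapping each of the two loops above in a function:
# print("FP32:", time_loop(train_fp32), "s")
# print("AMP :", time_loop(train_amp), "s")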

Thanks for reading!

You can find the original post, as well as others, on my blog.

References:

  1. https://developer.nvidia.com/automatic-mixed-precision
  2. https://pytorch.org/tutorials/recipes/recipes/amp_recipe.html