Robert Wright

Fedora 10 years later

February 5, 2025

I was watching a video about Fedora from 10 years ago, presented by Matthew Miller at FOSDEM ‘15 and I was interested in the challenges that the community faced at the time.

Fedora and Linux distros in general were declining in search interest, the way developers interact with packaging was changing, and GitHub was disrupting the world of software.

As I look at the updated chart of how Google indexes the trends, I can see the same pattern Matthew presented at the event, and I can also see the same decline continuing. The same trends seem to hold true across search terms for other distros as well.

Of our four core foundations in Fedora – First, Freedom, Features, Friends – I pause and wonder about how users are interacting with the Linux of today. With the rise of open source technology over the last decade and developers coming out of college and high school with a better appreciation of working in the open – what does it mean to use a Linux distro to today’s audiences? Is community enough, or is there some sort of interesting evolution for how packaging software together and distributing it will add impact to someone’s life?

Things I start to think about from a user’s perspective when I look at this are things like:

What does the hobby developer use for their OS? Do they have needs and wants that are not captured?
How do Fedora and its downstreams interact and tighten the gap between Enterprise Linux and Community Linux?
Does a boot-and-go experience built to perform a task, like the Remixes offer, matter more, or does having a strong base that each user adds upon matter more?
With the advent of data science tools exploding into the world of AI – how do we enable the future developers to do their best work using open source OSes as their foundation?

And on the community’s end – what does Fedora bring for them? As software becomes packaged more and more by the developer in language specific package managers, how can Fedora enable packagers and community members to do what they love? And more so, how can we enable people to love what they do?

I start to think about what brought me back to Fedora – values based work. Doing something that contributes to the broader world. Maybe that’s something worth exploring: showcasing how what Fedora does builds upon being a Digital Public Good and enables the entire world to grow.

At Flock ‘24, I got to see an OLPC for the first time. It was something I had read about since I was a teenager, fascinated by the impact and design that a mesh-wifi enabled laptop could bring for kids around the world. I want to see things like what OLPC did, using Fedora, making an impact on the world. It’s just a matter of finding what that next thing might be.

Fields upon Fields in GRC

February 3, 2025

In my consulting and advisory work, conversations with control owners have made one thing clear – more and more First Line of Defense (1LOD) teams are getting bogged down by their own GRC systems. I continue to hear a common pattern of the Second Line (2LOD) teams overcomplicating data capture, adding fields under the assumption that more data equals better compliance.

2LOD functions typically use this information to gain better insight into the business’s activities and to better determine risk against the functions they are protecting. The challenge for the business teams is keeping up with the changes and the terminology. I often see fields added with names like “Update Last Review Date” or “Control Audit Operating Effectiveness” without guidance on the value that having this data brings the business in the first place. The overwhelming number of fields turns an opportunity to find areas of improvement and use the data to reduce risk into a never-ending journey of data governance.

Another startling discovery is the lack of disciplined data governance practices in GRC team activities. Unlike other enterprise platforms such as HRIS and accounting systems, where fields have a stronger tie to controls to meet the needs of the GRC teams, GRC platforms often lack clarity around purpose, data ownership/stewardship, and assessment of continued need. That seems to be driving businesses away from the vision of “integrated risk” and back to tactical GRC solutions.

In former roles in data governance, my mantra was to keep things simple and get small wins quickly in order to get the basics right, and the same holds true for GRC data:

Create a listing of all of your objects (process, risk, control, audit, etc) and fields – and then document who owns that field. You may find that multiple people own the same field and it may launch further conversation.
Document how those objects connect with one another – and why they connect. It may be obvious to you why a Risk has Controls, but is it clear to the teams who are making those connections why they’re doing it?
Reassess semi-annually the value of anything you have above – if you don’t need it, either hide it or remove it from your GRC platform.
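
The first step can start very small. The sketch below is one hypothetical way to hold such an inventory in Python and surface the fields that have more than one owner, or none at all. The object names, field names, and owners are invented for illustration and are not taken from any real GRC platform.

```python
from collections import defaultdict

# Hypothetical inventory rows: (object, field, owner). All values are
# invented for illustration.
inventory = [
    ("Control", "Last Review Date", "2LOD Compliance"),
    ("Control", "Last Review Date", "Internal Audit"),
    ("Control", "Operating Effectiveness", "Internal Audit"),
    ("Risk", "Inherent Rating", "2LOD Risk"),
    ("Risk", "Legacy Score", None),  # nobody has claimed this field
]

def owners_by_field(rows):
    """Group owners per (object, field) pair, ignoring missing owners."""
    owners = defaultdict(set)
    for obj, field, owner in rows:
        key = (obj, field)
        owners[key]  # touch the key so unowned fields still appear
        if owner:
            owners[key].add(owner)
    return dict(owners)

def fields_needing_conversation(rows):
    """Return fields with no owner or with more than one owner."""
    return {
        key: sorted(names)
        for key, names in owners_by_field(rows).items()
        if len(names) != 1
    }

for (obj, field), names in fields_needing_conversation(inventory).items():
    label = ", ".join(names) if names else "no owner"
    print(f"{obj}.{field}: {label}")
```

A spreadsheet works just as well; the point is that the list of fields with zero or several owners becomes the agenda for the follow-up conversation.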

Can we see trends in the topics?

December 15, 2024

With the topic data and users identified, we can start to see patterns and spot trends in the data which may be more relevant in different ways – such as topics which are fluctuating or tell us usage patterns of our community.

import os
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Read and combine Parquet files
combined_df = pd.DataFrame()
for file in os.listdir("parquet_output"):
    if file.endswith(".parquet"):
        file_path = os.path.join("parquet_output", file)
        try:
            df = pd.read_parquet(file_path)
            df['sent_at'] = pd.to_datetime(df['sent_at'], errors='coerce')
            combined_df = pd.concat([combined_df, df], ignore_index=True)
            print(f"Successfully read {file}")
        except Exception as e:
            print(f"Error reading {file}: {e}")

# Clean data
initial_count = combined_df.shape[0]
combined_df.dropna(subset=['sent_at'], inplace=True)
cleaned_count = combined_df.shape[0]
print(f"Dropped {initial_count - cleaned_count} rows due to invalid 'sent_at'.")

# Assign week start and label
combined_df['week_start'] = combined_df['sent_at'].dt.to_period('W').dt.start_time
combined_df['week_label'] = combined_df['week_start'].dt.strftime('Week of %Y-%m-%d')

# Aggregate distinct users
aggregated_df = combined_df.groupby(['week_start', 'week_label', 'topic'])['username'].nunique().reset_index(name='distinct_user_count')

# Pivot for heatmap
heatmap_pivot = aggregated_df.pivot(index='week_start', columns='topic', values='distinct_user_count').fillna(0)
heatmap_pivot.sort_index(inplace=True)
heatmap_pivot.index = heatmap_pivot.index.strftime('Week of %Y-%m-%d')

# Select top N topics
top_topics = aggregated_df.groupby('topic')['distinct_user_count'].sum().nlargest(20).index
heatmap_top = heatmap_pivot[top_topics]

# Plot heatmap
plt.figure(figsize=(20, 12))
sns.heatmap(
    heatmap_top,
    annot=True,
    fmt=".0f",
    cmap='rocket_r',
    linewidths=0.5,
    linecolor='gray',
    cbar_kws={'label': 'Number of Distinct Users'}
)
plt.title(f'Weekly Distinct Users for Top {20} Topics', fontsize=18)
plt.xlabel('Topic', fontsize=14)
plt.ylabel('Week', fontsize=14)
plt.xticks(rotation=45, ha='right')
plt.yticks(rotation=0)
plt.tight_layout()
plt.show()

In the example above, we can see that on a trending basis, many users are using the COPR build system, but we don’t see as much traffic on the Discourse side to make the top of the list. This could mean people are contributing custom packages using Fedora build infrastructure, but maybe not engaging with the community who are using these packages.

We can also see that the trends with Pagure were pretty consistent until lately, with around 5-9 projects being added a week (the recent increase is likely spam).

This may be a good view down the road to see how the community is engaging with Fedora.

Can we see if community comes back?

December 15, 2024

With data available from the message bus, and extracted user context, we can now start to understand this information and use it in interesting ways. In a past career, we used to focus very strongly on how New Hires were performing week on week to understand if changes in training programs were making a difference. Since I worked in the BPO (business process outsourcing) field, this meant we were hiring large volumes of people on a weekly basis.

In the Fedora context, understanding if someone creates a FAS account and then uses it again could be interesting – do people create an account, post on Fedora Discussion once, and then never come back? How do folks become new packagers with engagement? How are we engaging people who want to understand our community? And can we do this without looking at individual users, but at the community as a whole?

The below Jupyter notebook cell sets out to use the data processed using the grep2parquet Git repo and some preprocessing I’ve done to make this data available. Since it’s not perfect, it may not capture everything, but should show us if people return.

We use the user extracted parquet files to create a weekly group: the week someone created a FAS account and we lump those users together into a Cohort. This Cohort means we’ll start to look at data on a weekly basis based on when you joined so we can see if changing something like how we engage makes a difference. Maybe a new event? Or a badge series for new joiners?

We will then look at the later data to see if that user account ever returned. For each week, we will track whether we saw that username return for any reason and count it in that week.

This cell below will produce the graphic:

# Import required libraries
import os
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from datetime import datetime, timedelta

# Set the path to the directory containing the parquet files
parquet_dir = "parquet_output"

# Initialize an empty DataFrame to store combined data
combined_df = pd.DataFrame()

# Read all parquet files from the directory
for file in os.listdir(parquet_dir):
    if file.endswith(".parquet"):
        file_path = os.path.join(parquet_dir, file)
        df = pd.read_parquet(file_path)  # Read parquet file
        df['sent_at'] = pd.to_datetime(df['sent_at'], errors='coerce').dt.floor('s')  # Parse 'sent_at'
        combined_df = pd.concat([combined_df, df], ignore_index=True)

# Drop rows with invalid 'sent_at' timestamps
combined_df.dropna(subset=['sent_at'], inplace=True)

# Determine the maximum date in the data
max_date = combined_df['sent_at'].max().date()
print(f"Maximum date in data: {max_date}")

# Filter new users based on the topic 'org.fedoraproject.prod.fas.user.create'
# (.copy() avoids pandas' SettingWithCopyWarning when adding columns below)
new_users_df = combined_df[combined_df['topic'] == 'org.fedoraproject.prod.fas.user.create'].copy()

# Assign cohorts based on the week of user creation
new_users_df['cohort_week'] = new_users_df['sent_at'].dt.to_period('W').dt.start_time
new_users_df['cohort_label'] = new_users_df['cohort_week'].dt.strftime('Week of %m/%d')

# Merge cohort info with all activities
activity_with_cohorts = combined_df.merge(
    new_users_df[['username', 'cohort_week', 'cohort_label']], 
    on='username', 
    how='inner'
)

# Calculate weeks since cohort creation
activity_with_cohorts['week_since_cohort'] = (
    activity_with_cohorts['sent_at'] - activity_with_cohorts['cohort_week']
).dt.days // 7

# Group by cohort and week to calculate returning users
weekly_activity = (
    activity_with_cohorts
    .groupby(['cohort_label', 'week_since_cohort'])['username']
    .nunique()
    .reset_index()
)

# Pivot to create a retention table
cohort_retention_table = weekly_activity.pivot(
    index='cohort_label', 
    columns='week_since_cohort', 
    values='username'
).fillna(0)

# Get cohort sizes (number of new users in each cohort)
cohort_sizes = new_users_df.groupby('cohort_label')['username'].nunique()

# Ensure that cohort retention matches the cohort sizes
cohort_retention_table = cohort_retention_table.reindex(cohort_sizes.index)

# **Use the maximum date from the data instead of the current system date**
current_date = max_date  # Replace datetime.now().date() with max_date

print(f"Using current_date as: {current_date}")

# Create an annotated table with returned/size and N/A for invalid future cells
# (cast to object so string labels can be stored without dtype warnings)
annotated_table = cohort_retention_table.copy().astype(object)

for cohort in annotated_table.index:
    # Get the cohort start date
    cohort_start = pd.to_datetime(new_users_df[new_users_df['cohort_label'] == cohort]['cohort_week'].iloc[0]).date()
    for col in annotated_table.columns:
        current_week_date = cohort_start + timedelta(weeks=col)
        if current_week_date > current_date:
            annotated_table.loc[cohort, col] = "N/A"
        else:
            returned = cohort_retention_table.loc[cohort, col]
            size = cohort_sizes[cohort]
            annotated_table.loc[cohort, col] = f"{int(returned)}/{int(size)}"

# Normalize weekly activity by cohort size to get retention rates
retention_rate = cohort_retention_table.div(cohort_sizes, axis=0) * 100

# Replace future weeks with NaN so they don't render in the heatmap
for cohort in retention_rate.index:
    cohort_start = pd.to_datetime(new_users_df[new_users_df['cohort_label'] == cohort]['cohort_week'].iloc[0]).date()
    for col in retention_rate.columns:
        current_week_date = cohort_start + timedelta(weeks=col)
        if current_week_date > current_date:
            retention_rate.loc[cohort, col] = np.nan  # Use NaN for blank cells

# Limit to a reasonable number of weeks (e.g., 12 weeks)
retention_rate = retention_rate.iloc[:, :12]
annotated_table = annotated_table.iloc[:, :12]

# Plot the heatmap for weekly cohort retention rates with annotations
plt.figure(figsize=(16, 10))
sns.heatmap(
    retention_rate, 
    annot=annotated_table,  # Display returned/size or N/A
    fmt="",                 # No specific number formatting
    cmap="Blues", 
    cbar_kws={'label': 'Retention Rate (%)'},
    linewidths=0.5,
    linecolor='gray'
)
plt.title('Weekly Cohort Retention Over Time')
plt.xlabel('Weeks Since Cohort Creation')
plt.ylabel('Cohort Week')
plt.xticks(ticks=range(0, 12), labels=[f'Week {i}' for i in range(12)])
plt.yticks(rotation=0)  # Keep cohort labels horizontal
plt.tight_layout()
plt.show()

The image above shows a 12 week view into this data from June 1st, 2024 to December 15th, 2024.

I ran the example above on December 15th, 2024, so future weeks may be blank because they haven’t happened yet, but it does show a pretty strong drop-off in activity after only 3 weeks. Of the users who create a Fedora FAS account, on average we retain only 5-6 of those users after 12 weeks. What happened to the other 300? Also, in the weeks of 12/02 and 12/09, there are a lot of messages for invalid accounts. Are we getting more spam sign-ups than actual users? More questions to be answered.

Let’s filter out anyone who has fewer than 2 events after their creation date from the cohort groups, meaning we’ll have fewer users in each week, but they are more likely to be real people.

# Identify users who have at least 2 events (note: this count includes the account-creation event itself)
valid_users = (
    activity_with_cohorts.groupby('username')
    .size()
    .reset_index(name='event_count')
    .query('event_count >= 2')['username']
)

# Filter the activity data for valid users only
activity_with_cohorts = activity_with_cohorts[activity_with_cohorts['username'].isin(valid_users)]
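
Note that the filter above counts every recorded event for a user, including the account-creation message itself. If the intent is to count only activity that happens after sign-up, a stricter variant could look like the sketch below. It runs on a tiny made-up frame with the same column names (`username`, `topic`, `sent_at`) rather than the real data, and the `created_at` column is a hypothetical addition.

```python
import pandas as pd

CREATE_TOPIC = "org.fedoraproject.prod.fas.user.create"
OTHER_TOPIC = "org.fedoraproject.prod.copr.build.end"

# Toy stand-in for activity_with_cohorts; usernames and timestamps are made up.
activity = pd.DataFrame({
    "username": ["alice", "alice", "alice", "bob", "bob", "carol"],
    "topic": [CREATE_TOPIC, OTHER_TOPIC, OTHER_TOPIC,
              CREATE_TOPIC, OTHER_TOPIC,
              CREATE_TOPIC],
    "sent_at": pd.to_datetime([
        "2024-12-02 10:00", "2024-12-03 09:00", "2024-12-10 12:00",
        "2024-12-02 11:00", "2024-12-02 11:05",
        "2024-12-04 08:00",
    ]),
})

# Hypothetical helper column: each user's first account-creation timestamp.
created_at = (
    activity[activity["topic"] == CREATE_TOPIC]
    .groupby("username")["sent_at"]
    .min()
    .rename("created_at")
)
activity = activity.join(created_at, on="username")

# Keep only events strictly after creation, then count them per user.
later_events = activity[activity["sent_at"] > activity["created_at"]]
event_counts = later_events.groupby("username").size()
valid_users = event_counts[event_counts >= 2].index

filtered = activity[activity["username"].isin(valid_users)]
print(sorted(filtered["username"].unique()))
```

In this toy data only one account has two or more events after its creation message, so it is the only one that survives the filter.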

When we remove those users, we still see a similar pattern, and the last two weeks seem louder than the rest of the dataset, so maybe we’re seeing more spam in the network. When I look up one of those usernames in the Parquet files, I see a lot of events happening really quickly. For privacy, I removed the username and hid the ID from this view.

It looks like in the last few weeks, users are signing up to create Pagure events. This likely relates to the remarks folks are making that Pagure is getting spammed with issues. This may inspire a next round of review to understand if we can spot spammers in the bus and help the infrastructure team.
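
As a starting point for that review, one simple signal is how many events an account produces shortly after it is created. The sketch below is an assumption-laden heuristic, not a tested spam detector: the one-hour window and the threshold of 20 events are arbitrary values picked for illustration, and the input is a plain list of (username, timestamp) pairs rather than the real Parquet data.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def flag_bursty_users(events, created, window=timedelta(hours=1), threshold=20):
    """Return usernames with at least `threshold` events within `window`
    of their account creation time.

    events:  iterable of (username, sent_at) pairs
    created: dict mapping username -> account creation datetime
    """
    counts = defaultdict(int)
    for username, sent_at in events:
        start = created.get(username)
        if start is not None and start <= sent_at <= start + window:
            counts[username] += 1
    return sorted(user for user, n in counts.items() if n >= threshold)

# Made-up data: one account fires 30 events in its first half hour,
# another is active twice over two days.
t0 = datetime(2024, 12, 9, 12, 0)
created = {"user_a": t0, "user_b": t0}
events = [("user_a", t0 + timedelta(minutes=i)) for i in range(30)]
events += [("user_b", t0 + timedelta(hours=3)), ("user_b", t0 + timedelta(days=2))]

print(flag_bursty_users(events, created))
```

Any real threshold would need to be tuned against accounts the infrastructure team has already confirmed as spam.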

Processing Messages from the Bus

December 15, 2024

With data available as Parquet files, we can now start to solve the first question: “How many people did something in Fedora on this date?” To do this, we will need to apply some preprocessing to this data to make it usable.

For each message based on topic, we need to extract a username or a FAS ID. Each topic will handle this differently.
Some messages on the bus are malformed or stored with invalid JSON characters. We will need to remove them.
Some messages are system-to-system messages and have no relevance. We need to remove those.
We need a set of files which have just the timestamp, the ID of the message for debugging, the topic so we know what the activity was, and the identified username.

As a note, this doesn’t solve for messages that contain multiple participants – for example, MeetBot will share anyone who talked in a Matrix chat as a participant. For this example, we are just extracting the messages where we can identify a single user. A similar script could be built for the topics which need to be broken out and processed differently.
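
For those multi-participant topics, the idea would be to emit one row per participant instead of one row per message. A minimal sketch follows. The `attendees` key, the topic name, and the message shape are hypothetical placeholders, since the real MeetBot body layout is not confirmed here.

```python
def rows_per_participant(message, participants_key="attendees"):
    """Yield one (sent_at, id, topic, username) row per participant.

    `participants_key` is a hypothetical field name; adjust it to whatever
    the real topic uses.
    """
    body = message.get("body", {})
    sent_at = message.get("headers", {}).get("sent-at")
    for username in body.get(participants_key, []):
        yield (sent_at, message.get("id"), message.get("topic"), username)

# Made-up message for illustration only.
example = {
    "id": "example-id",
    "topic": "example.meeting.complete",
    "headers": {"sent-at": "2024-12-15T17:00:00+00:00"},
    "body": {"attendees": ["user_a", "user_b", "user_c"]},
}

rows = list(rows_per_participant(example))
for row in rows:
    print(row)
```

The output keeps the same four columns as the single-user files, so both kinds of rows could be combined later.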

The example code below is not fully tested or considered complete – for example, I’ve identified some topics that return a full URL instead of the username string (https://src.fedoraproject.org/user/rwright instead of rwright) – but this is a way to start.

import os
import ast
import pandas as pd
import duckdb
import json

# Define the safe JSON format function
def safe_json_format(json_str):
    try:
        # Safely evaluate the string to a Python object
        python_obj = ast.literal_eval(json_str)
        # Convert the Python object to a JSON-formatted string
        json_str = json.dumps(python_obj)
        return json_str
    except (ValueError, SyntaxError):
        # Return None if parsing fails
        return None

# Open a DuckDB connection (use in-memory or specify a file path for persistence)
con = duckdb.connect(database=':memory:')

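# Input and output folders (assumed names; adjust to your local layout)
folder_path = "parquet_raw"        # raw Parquet files produced by grep2parquet
output_folder = "parquet_output"   # processed files, read by the analysis cells
os.makedirs(output_folder, exist_ok=True)

# Loop over each parquet file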
for file_name in sorted([f for f in os.listdir(folder_path) if f.endswith('.parquet')]):
    file_path = os.path.join(folder_path, file_name)
    
    # Load the parquet file with pandas
    df = pd.read_parquet(file_path)
    
    # Preprocess headers and body columns to handle single-quoted JSON
    df['headers'] = df['headers'].apply(safe_json_format)
    df['body'] = df['body'].apply(safe_json_format)
    
    # Register the DataFrame in DuckDB
    con.register('parquet_data', df)

    # Query to extract 'sent-at', 'topic', 'body', and 'username'
    query = """
        SELECT 
            CAST(json_extract(headers, '$.sent-at') AS TIMESTAMP) AS sent_at,
            id,
            topic,
            replace(
                CASE
                WHEN topic LIKE 'org.fedoraproject.prod.badges.badge.award%' THEN json_extract(body, '$.user.username')
                WHEN topic LIKE 'org.fedoraproject.prod.fedbadges%' THEN json_extract(body, '$.user.username')
                WHEN topic LIKE 'org.fedoraproject.prod.discourse.like%' THEN json_extract(body, '$.webhook_body.like.post.username')
                WHEN topic LIKE 'org.fedoraproject.prod.discourse.post%' THEN json_extract(body, '$.webhook_body.post.username')
                WHEN topic LIKE 'org.fedoraproject.prod.discourse.solved%' THEN json_extract(body, '$.webhook_body.solved.username')
                WHEN topic LIKE 'org.fedoraproject.prod.discourse.topic%' THEN json_extract(body, '$.webhook_body.topic.created_by.username')
                WHEN topic LIKE 'org.fedoraproject.prod.mailman%' THEN json_extract(body, '$.msg.from')
                WHEN topic LIKE 'org.fedoraproject.prod.planet%' THEN json_extract(body, '$.username')
                WHEN topic LIKE 'org.fedoraproject.prod.git%' THEN json_extract(body, '$.commit.username')
                WHEN topic LIKE 'org.fedoraproject.prod.fas%' THEN json_extract(body, '$.msg.user')
                WHEN topic LIKE 'org.fedoraproject.prod.openqa%' THEN json_extract(body, '$.user')
                WHEN topic LIKE 'org.fedoraproject.prod.bodhi.buildroot%' THEN json_extract(body, '$.override.submitter.name')
                WHEN topic LIKE 'org.fedoraproject.prod.bodhi.update.comment%' THEN json_extract(body, '$.comment.user.name')
                WHEN topic LIKE 'org.fedoraproject.prod.bodhi%' THEN json_extract(body, '$.update.user.name')
                WHEN topic LIKE 'org.fedoraproject.prod.bugzilla%' THEN json_extract(body, '$.event.user.login')
                WHEN topic LIKE 'org.fedoraproject.prod.waiver%' THEN json_extract(body, '$.username')
                WHEN topic LIKE 'org.fedoraproject.prod.fmn%' THEN json_extract(body, '$.user.name')
                WHEN topic LIKE 'org.fedoraproject.prod.buildsys%' THEN json_extract(body, '$.owner')
                WHEN topic LIKE 'org.fedoraproject.prod.copr%' THEN json_extract(body, '$.user')
                WHEN topic LIKE 'io.pagure.prod.pagure%' THEN json_extract(body, '$.agent')
                WHEN topic LIKE 'org.fedoraproject.prod.pagure.commit.flag%' THEN json_extract(body, '$.flag.user.name')
                WHEN topic LIKE 'org.centos.sig.integration.gitlab.redhat.centos-stream%' THEN json_extract(body, '$.user.name')
                WHEN topic LIKE 'org.fedoraproject.prod.wiki%' THEN json_extract(body, '$.user')
                WHEN topic LIKE 'org.release-monitoring.prod.anitya.%' THEN json_extract(body, '$.message.agent')
                WHEN topic LIKE 'org.fedoraproject.prod.maubot.cookie.give.%' THEN json_extract(body, '$.sender')
                WHEN topic LIKE 'org.fedoraproject.prod.kerneltest.upload.new%' THEN json_extract(body, '$.agent')
                WHEN topic LIKE 'org.fedoraproject.prod.fedocal%' THEN json_extract(body, '$.agent')
                WHEN topic LIKE 'org.centos.prod.buildsys%' THEN json_extract(body, '$.owner')
                WHEN topic LIKE 'org.fedoraproject.prod.badges.person.rank.advance%' THEN json_extract(body, '$.person.nickname')
                ELSE NULL
            END::TEXT, '"', ''
        ) AS username
        FROM parquet_data
        WHERE headers IS NOT NULL AND body IS NOT NULL
    """
    
    # Execute the query and fetch the result as a DataFrame
    result_df = con.execute(query).fetchdf()

    # Remove rows with empty 'username'
    result_df = result_df.query("username != ''").dropna(subset=['username'])

    # Define the output parquet path with the original parquet filename
    output_parquet = os.path.join(output_folder, f"{os.path.splitext(file_name)[0]}_processed.parquet")

    # Write the result to a parquet file
    result_df.to_parquet(output_parquet, index=False)
    print(f"Data written to {output_parquet} successfully.")

    # Clear the view for the next file
    con.unregister('parquet_data')

print("Data processed and written to individual parquet files successfully.")

# Close the DuckDB connection
con.close()
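
The URL-shaped usernames mentioned before the script could be cleaned up after the query with a small helper like the one below. It assumes the username is always the last path segment of the URL, which matches the `https://src.fedoraproject.org/user/rwright` case but has not been checked against every topic. `normalize_username` is a hypothetical name.

```python
from urllib.parse import urlparse

def normalize_username(value):
    """Return a bare username from either a plain name or a profile URL."""
    if value is None:
        return None
    value = value.strip()
    if value.startswith(("http://", "https://")):
        # Take the last non-empty path segment, e.g. /user/rwright -> rwright
        segments = [s for s in urlparse(value).path.split("/") if s]
        return segments[-1] if segments else None
    return value or None

print(normalize_username("https://src.fedoraproject.org/user/rwright"))
print(normalize_username("rwright"))
```

It could be applied with `result_df['username'] = result_df['username'].map(normalize_username)` before the rows with an empty username are dropped.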

grep2parquet

December 15, 2024

Within the Fedora Project, we have a Messaging Bus available for the various applications which are interacting within the community, sending out updates whenever something happens: package updates, builds, test results, forum posts, meetings, and more.

Coming from my data & reporting days, I know the priority for an organization is not necessarily getting the precise truth for all reports. In many cases an average, or a number with enough backing behind it, allows the business to focus on something more important instead: what to do about the number.

In Fedora’s case, having access to the message bus allows us to tap into this knowledge and start to answer questions about overall Community Health without requiring expensive integrations across the variety of platforms the project uses. Even better, the community already stores and logs this data for historical purposes in a tool called Datanommer, and makes it available to us via an HTTP REST API called Datagrepper.

What’s All This Data About?

Each message on the bus contains a variety of information mostly useful for other applications to take advantage of – but by topic we can see some important information start to emerge:

https://apps.fedoraproject.org/datagrepper/v2/id?id=1cf56046-167e-45b3-9f5c-4830720d6797&is_raw=true&size=extra-large

{
  "body": {
    "build": 8395718,
    "chroot": "fedora-rawhide-x86_64",
    "copr": "PyPI",
    "ip": "2620:52:3:1:dead:beef:cafe:c108",
    "owner": "@copr",
    "pid": 2480762,
    "pkg": "python-pytest-black",
    "status": 1,
    "user": "ksurma",
    "version": "0.4.0-1",
    "what": "build end: user:ksurma copr:PyPI build:8395718 pkg:python-pytest-black version:0.4.0-1 ip:2620:52:3:1:dead:beef:cafe:c108 pid:2480762 status:1",
    "who": "backend.worker-rpm_build_worker:8395718-fedora-rawhide-x86_64"
  },
  "headers": {
    "fedora_messaging_schema": "copr.build.end",
    "fedora_messaging_severity": 20,
    "fedora_messaging_user_ksurma": true,
    "priority": 0,
    "sent-at": "2024-12-15T17:09:56+00:00"
  },
  "id": "1cf56046-167e-45b3-9f5c-4830720d6797",
  "priority": 0,
  "queue": null,
  "topic": "org.fedoraproject.prod.copr.build.end"
}

In this message, we see from its topic that a COPR build just finished, that the message was sent on December 15th, 2024 at 5:09 PM UTC, and that user ksurma likely triggered this message from a COPR build action. We also know the package from the message was “python-pytest-black”.
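
Reading those fields out programmatically is straightforward. The sketch below parses a trimmed copy of the message above with the standard `json` module. Pulling usernames from header keys that start with `fedora_messaging_user_` is an assumption based on this one message, not a documented guarantee for every topic, and `summarize` is a hypothetical helper name.

```python
import json
from datetime import datetime

# Trimmed copy of the COPR message shown above.
raw = """
{
  "body": {"copr": "PyPI", "pkg": "python-pytest-black", "user": "ksurma"},
  "headers": {
    "fedora_messaging_schema": "copr.build.end",
    "fedora_messaging_user_ksurma": true,
    "sent-at": "2024-12-15T17:09:56+00:00"
  },
  "id": "1cf56046-167e-45b3-9f5c-4830720d6797",
  "topic": "org.fedoraproject.prod.copr.build.end"
}
"""

USER_HEADER_PREFIX = "fedora_messaging_user_"

def summarize(message):
    """Extract the topic, timestamp, package, and users from one message."""
    headers = message["headers"]
    users = sorted(
        key[len(USER_HEADER_PREFIX):]
        for key, value in headers.items()
        if key.startswith(USER_HEADER_PREFIX) and value
    )
    return {
        "topic": message["topic"],
        "sent_at": datetime.fromisoformat(headers["sent-at"]),
        "package": message["body"].get("pkg"),
        "users": users,
    }

summary = summarize(json.loads(raw))
print(summary)
```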

If we wanted to start to think about using this data – there’s already a trove of great information available to start thinking about:

How many people are using COPR?
How many people (or bots) are sending messages?
How is our community growing? What services do they use?
And more…

How can we get access to this data?

To start thinking about this data we have a few ways users can get access to the message bus data:

Create a consumer application to listen on the bus and capture events (which is what Datanommer / Datagrepper do today)
Gain access to the Datanommer PostgreSQL database hosted on Fedora Infra
Consume the data from Datagrepper and make it available locally

After some understanding of where the community is today, creating a new tool and a new database just to examine the data of the community isn’t a great way to go. We have the tools, we just need to get access to the data.

In Community Operations (CommOps), we’ve been looking to make either of the last two options the priority. The challenges with the Infra access are that:

The direct PostgreSQL database is only available to users who are infrastructure apprentices, meaning we have to place a high amount of trust in users to log in and use the production database. While I’ve done this myself, it’s made me nervous about how hard some of my queries might be on the system and how to scale this well – sharing passwords won’t work.
We also examined making a middle layer, like a BI platform, available to the community. Since the database is a full-fledged PostgreSQL database, we’ve examined open source solutions like Metabase, Apache Superset, and even the Django SQL Explorer module. The problem is that hosting this application near or next to this database is more of a challenge than we anticipated. While we are working with Infra to make a copy of Datanommer’s database in the cloud for the community to use, it’s likely months out or longer.

grep2parquet

That brings us to our last option, and a quick tool I’ve thrown together to get some movement on this: “grep2parquet.” It fetches historical data from the Datagrepper REST API (basically getting a batch of messages by day) and transforms it into Parquet format, a common choice for data analysis which allows us to start doing something with this data.

The problem with this approach is it can be inequitable for some community members:

You must have enough disk space to store a copy of the data you need. Since you have to pull raw messages, this can grow (a single day is around 50 MB of data compressed).
You have to have enough bandwidth to download this much data. Since it’s a large volume, it might take some time to just get started.

This also puts a drain on Fedora’s resources, because we’re sending a lot of requests to the REST API in rapid succession since we can only get 100 messages at a time from the bus. For now, this approach works – but in the future, I want to examine whether we can store the pre-downloaded files somewhere in Fedora infra, which would require less machine processing and let users just get what they need.
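
The paging that drives this load can be sketched as follows. This is not the actual grep2parquet code: the `search` endpoint path, the `start`, `end`, `rows_per_page`, and `page` query parameters, and the `raw_messages` and `pages` response keys are assumptions about the Datagrepper API that should be verified against its documentation. The fetch function is passed in so the loop can be exercised without touching Fedora’s servers.

```python
from datetime import date, datetime, time, timedelta, timezone
from urllib.parse import urlencode

# Assumed base URL, following the v2 URL shown earlier in this post.
BASE_URL = "https://apps.fedoraproject.org/datagrepper/v2/search"
ROWS_PER_PAGE = 100  # the per-request limit mentioned above

def page_url(day, page):
    """Build the URL for one page of one UTC day (parameter names assumed)."""
    start = datetime.combine(day, time.min, tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    params = {
        "start": int(start.timestamp()),
        "end": int(end.timestamp()),
        "rows_per_page": ROWS_PER_PAGE,
        "page": page,
    }
    return f"{BASE_URL}?{urlencode(params)}"

def fetch_day(day, fetch_json):
    """Collect all messages for a day by walking pages until the last one.

    `fetch_json` is any callable that takes a URL and returns the decoded
    JSON response as a dict.
    """
    messages, page = [], 1
    while True:
        payload = fetch_json(page_url(day, page))
        messages.extend(payload.get("raw_messages", []))
        if page >= payload.get("pages", 1):
            return messages
        page += 1

# Fake fetcher standing in for the network: 250 messages over 3 pages.
def fake_fetch(url):
    page = int(url.rsplit("page=", 1)[1])
    count = 100 if page < 3 else 50
    return {"raw_messages": [{"page": page}] * count, "pages": 3}

print(len(fetch_day(date(2024, 12, 15), fake_fetch)))
```

A real fetcher would wrap an HTTP client and should pause between requests, since every day of history means many sequential calls against shared infrastructure.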

For now, this at least lets us start moving – and some progress is better than none. I hope to share some updates soon on next steps.

Remembering FreePCTech

December 14, 2024

When I was growing up, my father, Bob Wright, ran an online community and resource hub called FreePCTech.com along with the mailing lists PCSOFT and PCBuild. He was an avid technologist with a mind to what PCs could do. In many ways, my childhood home was a place of technology exploration and he shared one of his passions with me as I grew up – a love of computers and technology.

It was through my dad’s business partner, Drew Dunn, that I first encountered Linux. Specifically Red Hat Linux (someone once told me as a joke – you don’t choose your distro, it chooses you). My dad and I would sit at his office area in our house, burning and labeling media his organization made available for purchase to folks who couldn’t download OS ISOs (back in the days when bandwidth was expensive!).

I remember struggling in my pre-teens the first time I tried to install a Linux distribution. I kept getting stuck on the partitioning screen in Anaconda, not understanding anything about the concept of swap space or why and how to partition my disk properly – I had never seen anything like this on the Windows machines I was coming from! After a few attempts, Drew and my dad stepped in, patiently explaining to me what swap was and why I needed it for the install. I got it to work – and I started on a longer journey through desktop environments and browsers, and I even remember the launch of Mozilla Firefox!

Although my life eventually led me away from the days of burning Linux disks and helping my dad with FreePCTech, the lessons I absorbed then remained at the core of how I approached technology. Those early encounters taught me that curiosity, collaboration, and an open mind are fundamental. And while my dad and I didn’t always see eye to eye, he helped show me a path that I would continue to walk for years to come. I continue to be grateful for FreePCTech setting such a strong foundation for me.

I Love Free Software Day

February 14, 2024

At FOSDEM while working at the Fedora booth, lots of different people approached us to talk about Fedora Linux, the various communities they were in, and their interests.

On Sunday, a small group walked up to the booth and asked me, “Do you know what I Love Free Software Day is?”. Having not been broadly in the open source community until last year, I asked them what it was.

Every February 14th, community members say Thank You and show appreciation to contributors in open source communities and the amazing things they do. The Free Software Foundation Europe team hold this as an annual event and send postcards to different communities.

The team introduced themselves as members of the FSFE and they had a simple ask – “Are you up for a challenge?”. How could I say no?

They gave me a set of materials and a card with a link to a Git repo and wished me luck. I placed it in my backpack and went along with the rest of the day handing out stickers to everyone who walked by.

After returning home to Portland, I set out to try to figure out how to do this. I needed to take this Arduino and somehow make this thing flash lights. I am not a hardware guy, and the FSFE team warned me I needed to be able to solder.

I’ve never soldered anything before nor do I own a soldering iron. So the first task was to acquire one of those. After visiting Home Depot, I managed to find both electrical solder and a soldering iron.

After reading the instructions briefly and figuring out how hard it could be, I fired it up.

The first attempt I made at soldering was the grey wire you see below. As someone who is not a hardware person and has not used a soldering iron before, this was exceptionally difficult.

I knew I needed to heat the solder and tap the tip against the wire. I prepped each wire by cutting a bit of the tip down and got them lined up. And afterward, I attempted to make a connection to the first wire.

After fussing with it for quite some time, I ended up pausing and finding a YouTube video since it was 3 in the morning where I was, and I couldn’t call my brother, who is an actual engineer, to ask for help.

The video instructed me to prep the iron and to get a sponge to help dampen and remove some of the excess solder from the iron as I was working. This seemed to help, as my second and third attempts at soldering wires were much smoother. I tried my best to clean up the first wire where I could.

I cleaned up the cabling, plugged in the wires to the Arduino, and went to work to compile the included software.

This part was much more straightforward as I could read code. I also like to just do things until they work, which I can do easily with software, so installing PlatformIO on my Fedora laptop was a breeze. The included script compiled successfully, and I was able to flash it to the Arduino without any errors.

I waited for a few moments after the flash was complete, expecting to see some lights. But sadly as more time went on, and the lights did not activate, I wondered if my soldering job was behind this.

I looked back at the Git repo that was included with the project and reread the instructions to make sure I didn’t miss something when I discovered my fatal mistake.

Oops – if you noticed above, I soldered the wires to the wrong end of the LED strip.

After a quick cut of the wires and the blessing in disguise to get to resolder all the wires, I ended up with a new, cleaner version of the wire connections.

See – much better.

And while holding my breath, I reconnected the wires to the Arduino and plugged it back in. 3 seconds later…

We have light! I did it! It started to glow lights as it went along! I tidied up the case, attached the strip to the two ends, and then attached the Arduino to the back of the heart. It now glows slowly as it covers my desk like this:

Now it wouldn’t be I Love Free Software Day if I didn’t say thank you to the amazing contributors and community at the Fedora Project. There are so many of you I’ve met over the past year who have made my life better by welcoming me into the community. Our Friends foundation is what matters most, as when we are working together and helping each other – we’re able to do amazing things!

I am grateful for everyone, not only in the Fedora Project but also in the open-source communities around the world who make technology better for everyone! Thank You!

And for this heart – I will find it a good home in the Fedora community. <3