Compare commits: d235c415b8...main (35 commits)
Commits: 60d6df4c82, 01f24fe8f3, 6f72cd3711, bcfb6a4361, 9eebcb317e, 8efeca9df0, b97aa0ba68, 66f534e216, 3010bafb25, b113098f48, d51591cd05, 9630a14124, 7e27faef70, c52d4dd109, 6d5ab66ed9, f3efad590a, 4abdca4a42, 47907c86fa, 611b65cac8, 2600c53ff8, 5a5564b770, b78805b312, 38a1bd6b32, b22b58c935, 1ebf8f2581, e8b2bcd99b, f5f34d864f, c30b8214e0, 18c36ee123, e748c623c7, c5294238a9, c494638cb7, 27bb51cb39, 29ff5b5b96, f09b4b8186
.gitignore (vendored, new file, 5 lines)
@@ -0,0 +1,5 @@
.hugo_build.lock
public
resources
.gitmodules (vendored, new file, 3 lines)
@@ -0,0 +1,3 @@
[submodule "themes/hugo-theme-terminal"]
	path = themes/hugo-theme-terminal
	url = https://github.com/panr/hugo-theme-terminal.git
config.toml (54 lines changed)
@@ -13,10 +13,10 @@ paginate = 5
contentTypeName = "posts"

# ["orange", "blue", "red", "green", "pink"]
themeColor = "pink"
themeColor = "green"

# if you set this to 0, only submenu trigger will be visible
showMenuItems = 0
showMenuItems = 2

# show selector to switch language
showLanguageSelector = false
@@ -43,35 +43,36 @@ paginate = 5
# updatedDatePrefix = "Updated"

# set all headings to their default size (depending on browser settings)
# oneHeadingSize = true # default
oneHeadingSize = false

# whether to show a page's estimated reading time
# readingTime = false # default
readingTime = true

# whether to show a table of contents
# can be overridden in a page's front-matter
# Toc = false # default
Toc = true

# set title for the table of contents
# can be overridden in a page's front-matter
# TocTitle = "Table of Contents" # default

[languages]
[languages.en]
languageName = "English"
title = "Flow With Halvor"
subtitle = "Basic Security Blog"
owner = "Paul Halvorsen"
keywords = ""
title = "Flow With Halvo"
copyright = ""
menuMore = "Show more"
readMore = "Read more"
readOtherPosts = "Read other posts"
newerPosts = "Newer posts"
olderPosts = "Older posts"
missingContentMessage = "Page not found..."
missingBackButtonLabel = "Back to home page"

[language.en.params]
subtitle = "Basic Security Blog"
owner = "Paul Halvorsen"
keywords = ""
menuMore = "Show more"
readMore = "Read more"
readOtherPosts = "Read other posts"
newerPosts = "Newer posts"
olderPosts = "Older posts"
missingContentMessage = "Page not found..."
missingBackButtonLabel = "Back to home page"

[languages.en.params.logo]
logoText = "Flow With Halvo"
@@ -82,12 +83,25 @@ paginate = 5
identifier = "about"
name = "About"
url = "/about"
weight = 10
[[languages.en.menu.main]]
identifier = "showcase"
name = "Showcase"
url = "/showcase"
identifier = "posts"
name = "Posts"
url = "/posts"
weight = 20
[[languages.en.menu.main]]
identifier = "ideas"
name = "Ideas"
url = "/ideas"
weight = 25
[[languages.en.menu.main]]
identifier = "podcasts"
name = "Podcasts"
url = "/podcasts"
weight = 30
[[languages.en.menu.main]]
identifier = "personal"
name = "Other Posts"
url = "/pposts"
weight = 40
@@ -1,32 +1,41 @@
# Summary
---
title: "About"
date: 2025-08-25
draft: false
---

I'm a Software Engineer with over 11 years development and 15 years professional experience, with exposure to C, Python, PHP, JavaScript, Java, and C++ languages; various SQL databases; JQuery and Pytest frameworks; Docker containerization; and Rest API, JSON, XML, and nginx technologies.
## Summary

# Work Experience
My name is Paul and I'm a Software Engineer with over 13 years development and 17 years professional experience, with exposure to Rust, C, Python, PHP, Go, JavaScript, Java, and C++ languages; various SQL databases; JQuery and Pytest frameworks; Docker containerization; and Rest API, NATS, JSON, XML, and nginx technologies.

## Binary Defense
## Work Experience

### Binary Defense

**Sr Software Engineer**: April 2022 - Present

- Rust development using cargo, cmake, and cross compilation
- Python development using pyenv, pipenv, cython, docker build environment, gitlab pipelines, and static compilation
- Develop security alarms for Windows, Linux (Debian and RedHat), and MacOS
- Written RFC and ADR to drive design and decision making on project direction
- Design and build containment for all platforms upon detected compromise
- Design and build secure key exchange and connections

## Kyrus Tech
### Kyrus Tech

**Sr Software Engineer**: Nov 2020 - April 2022

- Perform test driven development for all tasks: C, Python/Pytest, Docker, GitLab CI/CD
- Perform test driven development: C, Python/Pytest, Docker, GitLab CI/CD
- Build covert communications and file transfers proxy: HTTPS, Apache Thrift, Rest API
- Design compact router fingerprinting and vulnerability analysis: Android, HTTPS, TCP/IP, StreamCypher Encryption
- Modify existing code to suppress logging from inside the Linux Kernel: various Linux Kernel versions, Ghidra
- Modify existing code to suppress system logging from Linux Kernel module: various Linux Kernel versions, Ghidra

## Parsons
### Parsons

**Cyber Security Software Engineer**: Apr 2018 - Nov 2020

- Continue development of covert Windows application: C, C++, Python
- Build modular solution for plug and play architecture
- Build modular solution for plugin architecture
- Design custom API for minimal data transfer to back-end
- Encrypt storage and comms using AES shared key to maintain confidentiality and integrity
- Build prototype back-end service for file storage and search: Java, Tomcat, Niagarafiles (NiFi), nginx, Hadoop, MySQL, LDAP, RBAC
@@ -34,7 +43,7 @@ I'm a Software Engineer with over 11 years development and 15 years professional
- Track and maintain multi-level user access
- Generate metadata for searching

## NSA
### NSA

**Security Software Engineer**: Nov 2011 - Apr 2018
@@ -54,7 +63,7 @@ I'm a Software Engineer with over 11 years development and 15 years professional
- Organize, train, and participate in team performing 24x7 call-in rotation
- Responsible for 5+ domestic and foreign system deployments

## Salisbury University
### Salisbury University

**Software Developer**: Nov 2006 - May 2008
@@ -69,35 +78,18 @@ I'm a Software Engineer with over 11 years development and 15 years professional
- Maintain the Linux labs on campus: dual boot OpenSUSE, WindowsXP, and OpenSUSE server
- Perform backups, updates, user management (LDAP), disk quotas, and remote access

# Education

University of Maryland Baltimore Campus
: Masters in Computer Science; 2013. Thesis: "Stateless Detection of Malicious Traffic: Emphasis on User Privacy"

Salisbury University
: Bachelors in Computer Science; 2009. Magna Cum-Laude

Security+
: ID: COMP001021281239; Exp Date: 04/04/2024

Royal Military College (RMC Canada)
: Training in OpenBSD development and administration

# Miscellaneous

RedBlue Conference
: Presented combination web enumeration/exploitation tool

National Conference for Undergrad Research (NCUR)
: Presented development of STK scenario building and manipulation

SANS Courses
: Staying up-to-date on security research
## Education

- **University of Maryland Baltimore Campus**: Masters in Computer Science; 2013. Thesis: "Stateless Detection of Malicious Traffic: Emphasis on User Privacy"
- **Salisbury University**: Bachelors in Computer Science; 2009. Magna Cum-Laude
- **Security+ (Expired)**: ID: COMP001021281239; Exp Date: 04/04/2024
- **Royal Military College (RMC Canada)**: Training in OpenBSD development and administration

## Miscellaneous

- **RedBlue Conference**: Presented combination web enumeration/exploitation tool
- **National Conference for Undergrad Research (NCUR)**: Presented development of STK scenario building and manipulation
- **SANS Courses**: Staying up-to-date on security research
- **Homelab**: Running email, cloud storage, gitea, DNS, multimedia, genealogy, and static web page services
- **Web Admin for PTA**: Setup and maintain a WordPress site
content/ideas/ideas.md (new file, 47 lines)
@@ -0,0 +1,47 @@
---
title: "Blog Ideas"
date: 2025-08-25
draft: false
---

## Ideas

- Setting up proxmox
- Migrating from one proxmox server to another
- Writing a Baseball Game predictor
  - Getting the data
  - Formatting the data
  - Database
  - Getting weather data
    - Historic
    - Prediction
  - Using Python neural net
- Bad Malware Analysis
  - Average value per string
  - Call tree depth
  - Number of library/function calls
  - Put together multiple test detection
  - Size of bin vs function calls
  - Total value/average of total strings
- Bad password Analysis
  - Ascii codes
  - Grid map of sequences
  - Heat map on phone keyboard
  - Number of keyboard shifts on phone
- How good requirements help security
- Selfhosted email setup
- Rust security features
- Silence on the Wire, is it still relevant
- Tangled Web, is it still relevant
- Summations of chapters on secure coding
  - How the "spirit of C" can lead to issues (p. 21)
  - Stack protection (pp. 109, 111)
  - Tables on pp. 92, 93
  - Watch out for compiler optimization (pp. 37, 40, 41, 152, 153)
- iNaturalist
  - What is it and how to use it
  - Possible analysis
- Exercise
  - My goals
  - Calculating progress
  - Graphing progress
content/podcasts/index.md (new file, 48 lines)
@@ -0,0 +1,48 @@
## README

### How to Use

Each of the following podcasts is a link to an RSS feed. To use them:

1. Copy the link
2. Go to your podcast app and add a new podcast
3. Select "add using RSS feed"
4. Paste the copied link

### Caveats

1. These podcasts are created by downloading YouTube videos from a playlist. As such, their original format was visual, so some information may be lost in the audio-only version.
2. Since these are pulled from YouTube, episodes could be out of order if the original uploader didn't date them properly.
   1. From what I've seen, most are in order, but just be aware.

## Podcasts

### Crash Course History

- [Big History 1](https://podcast.halvo.me/big_history_1.xml)
- [Big History 2](https://podcast.halvo.me/big_history_2.xml)
- [World History 1](https://podcast.halvo.me/world_history_1.xml)
- [World History 2](https://podcast.halvo.me/world_history_2.xml)
- [US History](https://podcast.halvo.me/us_history.xml)

### Crash Course Other

- [Literature 1](https://podcast.halvo.me/literature_1.xml)
- [Philosophy](https://podcast.halvo.me/philosophy.xml)
- [Religions](https://podcast.halvo.me/religions.xml)
- [World Mythology](https://podcast.halvo.me/world_mythology.xml)

### Storied

- [Monstrum](https://podcast.halvo.me/monstrum.xml)
- [Fate and Fabled](https://podcast.halvo.me/fatenfabled.xml)
- [Otherwords](https://podcast.halvo.me/otherwords.xml)

### Overly Sarcastic

- [Trope Talk](https://podcast.halvo.me/tropetalk.xml)

### Hank Green

- [All Hank Green](https://podcast.halvo.me/hankgreen.xml)
@@ -4,13 +4,13 @@ date: 2019-08-01
draft: false
---

# Security Blog
## Security Blog

This blog is various summaries of minor research, reading, and independent learning in regards to computer security.

Mostly this blog is to satisfy the requirements for my Security+ certificate.

# Cert ID
## Cert ID

Security+ ID: COMP001021281239
@@ -4,13 +4,13 @@ date: 2020-03-06
draft: false
---

# Introduction
## Introduction
I'm thinking of doing a series on bad malware analysis. Hopefully it'll be fun and at least a little informative.

Today's post consists of performing a string analysis on malware. Where most string analysis looks at the big picture, I thought I would take it a step further and look at individual characters. This approach is terrible, as you will soon see.

# Why Strings
## Why Strings

If you've made it this far, I'm assuming you already have some basic knowledge of computers and maybe even of looking at malware. As such, you may already know what string analysis is all about, but here is a quick crash course on strings.
@@ -23,7 +23,7 @@ In order for a signature to be created from strings, it needs to be very very sp
Indicators are a little more practical for use with strings. The more indicators, the more confident you can be that this is a piece of malware.

# Why Characters
## Why Characters

Now for my terrible way of using strings ... character analysis.
@@ -33,21 +33,21 @@ Cutting to the chase, if a piece of software has a lot of the following characte
v j ; , 4 q 5 /

## Why Those Characters
### Why Those Characters

How did I come to such a wild conclusion that v's and j's are a problem ... time for some terrible analysis.

## Where Are the Samples
### Where Are the Samples

To perform my analysis, I pulled down around 500 samples of malware from theZoo (https://thezoo.morirt.com/) and dasMalwerk (https://dasmalwerk.eu/). For samples of benign software I grabbed all of /bin on Fedora and 200 libraries from the C:/Windows directory.

# How Was it Analysed
## How Was it Analysed

Next I wrote a python program to run strings, loop through each individual character, make them lowercase, then count. This was done for both malware and benign samples, then compared in two ways:
1. Count the total number of characters in the malware samples and the total number in the benign samples, then subtract the two. Sort and look.
2. Take the ratio of each character count to the file size for the malware and benign samples. Average that across all files, then subtract and compare. (Don't worry, I'll explain.)
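The counting step described above might look something like this. This is a sketch, not the original script: the external `strings` tool is approximated with a printable-ASCII regex, and the helper names are illustrative.

```python
import re
from collections import Counter

def strings_of(data: bytes, min_len: int = 4):
    """Rough stand-in for the `strings` tool: runs of printable ASCII."""
    return [m.group().decode() for m in re.finditer(rb"[ -~]{%d,}" % min_len, data)]

def char_counts(data: bytes) -> Counter:
    """Lowercased per-character tally across all extracted strings."""
    counts = Counter()
    for s in strings_of(data):
        counts.update(s.lower())
    return counts

def char_ratios(data: bytes) -> dict:
    """Each character's count divided by the file size (the 'ratio' method)."""
    size = max(len(data), 1)
    return {ch: n / size for ch, n in char_counts(data).items()}
```

The subtract-and-sort comparison then reduces to summing `char_counts` over each corpus and diffing the two tallies.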

## Basic Count
### Basic Count

The basic count is fairly self explanatory: just keep a running tally of characters and subtract. Here are the top ten characters most likely and least likely to be in malware:
@@ -59,7 +59,7 @@ This is terrible for many reasons, but specifically because it is un-weighted. S
I wanted to find a way to weight the characters, such that a single sample couldn't skew all of the results.

## Ratio Analysis
### Ratio Analysis

Here's where it gets more complicated, and I'll try to explain.
1. Keep a running tally per malware sample (not a total for all samples)
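The per-sample tally and the averaging that follows might be sketched like this (an assumption-laden sketch, not the original code; here each sample is just its raw bytes, and characters are counted directly rather than via `strings`):

```python
from collections import Counter, defaultdict

def char_ratio(data: bytes) -> dict:
    """One sample: each printable character's count divided by the file size."""
    counts = Counter(chr(b).lower() for b in data if 32 <= b < 127)
    size = max(len(data), 1)
    return {c: n / size for c, n in counts.items()}

def averaged_ratios(samples: list) -> dict:
    """Average the per-sample ratios so one large file can't skew the results."""
    sums = defaultdict(float)
    for data in samples:
        for c, r in char_ratio(data).items():
            sums[c] += r
    n = max(len(samples), 1)
    return {c: s / n for c, s in sums.items()}

def ratio_diff(malicious: dict, benign: dict) -> dict:
    """Malicious minus benign average ratio; positive means 'more common in malware'."""
    chars = set(malicious) | set(benign)
    return {c: malicious.get(c, 0.0) - benign.get(c, 0.0) for c in chars}
```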
@@ -75,7 +75,7 @@ _ s g r o $ f i a "
Obviously this gives pretty big differences: double quote went from being the worst offender to the most benign. Using the ratio gives a much better analysis since it doesn't allow a single sample to skew the results.

# So What Now
## So What Now

Using the ratio is probably good on its own, so how did I come up with my character dirty list? I looked at the worst offenders of both ways of analysing the code and came up with the list:
@@ -4,13 +4,13 @@ date: 2020-04-12
draft: false
---

# Introduction
## Introduction

For this bad malware analysis, I thought I would continue the theme of counting letters ... that way I could use most of my old code :)

Today, I decided to hash each file using sha512. Hashing is supposed to be completely random, so this is almost a test of that as well. I used around 3000 malicious samples and 1800 benign, so let's get started.

# Why Hash, Why sha512
## Why Hash, Why sha512

Hashing binaries is done all the time to verify downloads, check for changes, provide signatures, provide low hanging fruit for malware signatures, and many more purposes. It is so widely used, I was wondering if it was possible to use the hash itself as a flag to determine if a file could be malware (beyond just a hash table).
@@ -20,31 +20,31 @@ The reason I decided to do the letter count on hashes was for two reasons; 1) it
The reason I decided sha512 is also twofold: 1) it's long, so it'll provide some of the most data, and 2) sha in general is one of the most accepted hashing algorithms, so I went with that.

# What Was My Result
## What Was My Result

Surprising! There seems to be a pattern of which characters show up most in hashes for malware.

# What!
## What!

Yep, it appears that if you see around 3% more f's and 1% more 7's and 5's in your sha512 hash, then you might have some malware.

## That Can't be Right!
### That Can't be Right!

Hard to believe, but that is what it seems like: 'f, 7, and 5' show up more, and 'e and 6' show up 1% less, in malware.

# Ok, So How Was it Done
## Ok, So How Was it Done

## Where are the Samples
### Where are the Samples

Same as my string analysis: to perform my hash analysis, I pulled down around 500 samples of malware from [theZoo](https://thezoo.morirt.com/) and [dasMalwerk](https://dasmalwerk.eu/). For samples of benign software I grabbed all of /bin on Fedora and 200 libraries from the C:/Windows directory.

## How was it Analysed
### How was it Analysed

I modified my program from doing string checks to perform the hash analysis. Now, instead of running strings on each of the files, it performs a sha512 hash. I then averaged the number of each character seen for each file. This means I counted the number of '1's seen for all malicious file hashes, then divided by the total number of files.

This was done for all characters for both the malicious and benign binaries. After that I subtracted the benign averages from the malicious averages and divided by the original value.
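That per-digit averaging might be sketched as follows. A minimal sketch under stated assumptions: loading the actual sample corpora is out of scope, so the "files" here are just byte strings.

```python
import hashlib
from collections import Counter

HEX_DIGITS = "0123456789abcdef"

def average_digit_counts(samples: list) -> dict:
    """Average how often each hex digit appears in the sha512 hash of each sample."""
    totals = Counter()
    for data in samples:
        totals.update(hashlib.sha512(data).hexdigest())
    n = max(len(samples), 1)
    return {d: totals[d] / n for d in HEX_DIGITS}

def relative_difference(malicious: dict, benign: dict) -> dict:
    """(malicious - benign) / benign per digit; positive means 'seen more in malware'."""
    return {d: (malicious[d] - benign[d]) / benign[d] if benign[d] else 0.0
            for d in HEX_DIGITS}
```

Since every sha512 hex digest is 128 characters, the sixteen averages always sum to 128, which makes a handy sanity check.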
# Why?
## Why?

So a difference of 1 - 2% is not that much, but 3% seems more significant. This shouldn't happen; all characters should show up about evenly. This can probably be accounted for by just the samples that I had chosen. Choose a different set of 1000 binaries and the results could be different.
@@ -4,15 +4,15 @@ date: 2021-03-08T20:20:31Z
draft: false
---

# Introduction
## Introduction

Next up in bad malware analysis is comparing the size of a file to the output of the command strings. The idea here is that malware may contain fewer strings per KB than benign binaries. This would make logical sense, as many malware samples are packed, encrypted, and/or stored in the data section of the binary, to be extracted later. This is done to help obfuscate them from hash signatures.

# Samples
## Samples

There are around 500 malware samples, coming from two sources: [theZoo](https://thezoo.morirt.com/) and [dasMalwerk](https://dasmalwerk.eu/). For samples of benign software I grabbed 200 libraries from the C:/Windows directory.

# Calculations
## Calculations

Using python I wrote a quick script to count the number of strings returned (separated by a newline) and compared it to the size (in KB) of the file. I performed this using strings of min size 2, 3, 4, 5, and 6. Why those numbers ... because that is where I decided to stop. The average strings per KB was then calculated.
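The calculation described above might be sketched like this. It is an approximation, not the original script: the external `strings` command is replaced with a printable-ASCII regex so the snippet is self-contained.

```python
import re

def strings_per_kb(data: bytes, min_len: int) -> float:
    """Count printable-ASCII runs of at least min_len, per KB of file."""
    count = len(re.findall(rb"[ -~]{%d,}" % min_len, data))
    kb = max(len(data) / 1024, 1e-9)  # avoid dividing by zero on empty files
    return count / kb

def average_strings_per_kb(samples: list, min_len: int) -> float:
    """Average the per-file ratio across a corpus of samples."""
    if not samples:
        return 0.0
    return sum(strings_per_kb(d, min_len) for d in samples) / len(samples)
```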
@@ -24,7 +24,7 @@ Using python I wrote a quick script to count the number of strings returned (sep
| 5 | 5.59 | 5.58 | 0.18 % |
| 6 | 4.32 | 3.96 | 8.33 % |

# Results
## Results

The results are kinda in line with what I thought. Most of the malicious binaries have fewer strings per KB than the benign ones. Surprisingly, looking at minimum string lengths of two and five, the benign and malicious binaries have about the same number of strings per KB. The string length of two makes sense, as a lot of strings that small come down to random bytes in the binary looking like strings.
@@ -34,7 +34,7 @@ It appears the sweet spot for comparing malicious to benign binaries is four. At
Overall the results were in line with what I expected; however, they were a lot closer than I thought they would be.

# Future Work
## Future Work

If this were not bad malware analysis I would continue to look at the individual strings for patterns ... oh wait, that was in a previous bad malware analysis.
@@ -4,13 +4,13 @@ date: 2020-09-16
draft: false
---

# Introduction
## Introduction

Continuing from my Bad Malware Analysis, we now take a look at Bad Password Analysis. Mostly this is just for the fun of it, but we'll see if we can learn anything along the way.

In this Bad Password Analysis post, we'll look at consecutive character frequency. I've done analysis on two and three consecutive characters and compared it to a word frequency list generated from subtitle archives.

# Data
## Data

The passwords come from several leaks. These include honeynet, myspace, rockyou, hotmail, phpbb, and tuscl lists. All of these lists contain the count of how many times a password was used as well. In total there are 14,584,438 unique passwords.
@@ -18,7 +18,7 @@ For comparison, I'm using an English word frequency list generated from subtitle
I wrote a quick script to combine these into a single text file, to remove all duplicates and update all counts. This was the list used for all further analysis.

# Algorithm
## Algorithm
Everything is written in python.

A few decisions needed to be made before analyzing the data. The first thing I decided was to not worry about substitutions, so in my analysis @ does not equal a. This is a limitation, since handling substitutions would provide a more accurate representation of characters in passwords. Second, all passwords and English words were set to lowercase. This way patterns would be more apparent. If the goal is cracking, it's incredibly easy to just change cases.
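Under those decisions, the two-character tally might look like this. A sketch, not the original script: the password lists themselves are assumed, with each entry a (password, use-count) pair.

```python
from collections import Counter

def bigram_frequencies(entries) -> Counter:
    """Tally two-character combinations, weighted by each password's use count."""
    counts = Counter()
    for word, freq in entries:
        w = word.lower()  # lowercase everything so patterns are more apparent
        for i in range(len(w) - 1):
            counts[w[i:i + 2]] += freq
    return counts

def top_coverage(counts: Counter, n: int = 100) -> float:
    """Fraction of all combinations covered by the n most common bigrams."""
    total = sum(counts.values())
    return sum(c for _, c in counts.most_common(n)) / total if total else 0.0
```

Passing `freq=1` for every entry gives the "without frequency" variant discussed below.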
@@ -37,24 +37,24 @@ There is an option to turn off the use of frequency. I've analyzed this below as
Given a starting character, will this analysis allow us to predict what the next character will be?

# Analysis
## Analysis
For the analysis, I looked at it with and without frequency counts. Within that, I did an internal comparison of the frequency of each character set seen, as well as a comparison with the dictionary values.

## With Frequency
### With Frequency
With frequency taken into account, the top 100 password two-character combinations only cover 11% of all combinations. This seems rather low (I know, very technical), so intuition says this is not a good way to predict the password. In addition, the top combination is 's2', which only constitutes 0.15% of combinations.

Let's compare this to the dictionary words. The top 100 combinations cover a staggering 60% of all combinations. This would be a good predictor for what the next letter in English would be. The top combination in the dictionary data, 'th', covers almost 3% of all combinations.

Comparing the two further, we can see 10 character combinations shared between the top 100 password and dictionary characters. I was expecting this to be higher, but this could be due to character substitutions in passwords. For example, 'mo' is in the dictionary top 100 but not for passwords; however, 'm0' is in the top 100 password list.

## Without Frequency
### Without Frequency
Without taking into account the frequency of words and passwords, things don't change much. The top 100 passwords now account for 35% of all combinations, which seems like it could be a better predictor. But this weighs good unique passwords the same as common ones. The dictionary gets worse, at only 45% of combinations accounted for in the top 100.

Without taking into account frequency, '08' becomes the top password combination at 0.79% and 'se' becomes the top dictionary combination at 1.13%.

Surprisingly, without taking frequency into account, we see fewer substitutions in the password data. This means we now see 64 out of 100 duplicates between the data sets. This is closer to what I would have expected. Most people tend to use dictionary words for their passwords, so it would make sense to see duplicates across the data.

# Conclusion
## Conclusion
This is probably not a good way to go about cracking passwords. Mostly this data simply shows to use dictionary word lists and substitution lists.

We could have done a few things better. One of which is to look at common substitutions and see how that changes things. In many of the passwords, the standard alpha characters are replaced by numbers and symbols, such as @ or 4 for a, 5 or $ for s, and so on.
@@ -4,7 +4,7 @@ date: 2021-03-11T18:55:01Z
draft: false
---

# Introduction
## Introduction

For this episode of bad analysis, we are going to be looking at word frequency in passwords. Overall this isn't terrible analysis, but what makes it bad is I'm just looking for the first occurrence of a dictionary word in each password. This will miss *a lot* of words in passwords.
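That first-occurrence search might be sketched like this. A hypothetical sketch, not the original code: the dictionary set is assumed, and preferring longer matches at each position is my choice of tiebreak, not necessarily the original's.

```python
def first_dictionary_word(password, words, min_len=3):
    """Return the first dictionary word found in the password, scanning left to
    right and preferring longer matches at each start position; None if no match."""
    p = password.lower()
    for start in range(len(p)):
        for end in range(len(p), start + min_len - 1, -1):
            if p[start:end] in words:
                return p[start:end]
    return None
```

Stopping at the first hit is exactly why words later in the password get missed.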
@@ -17,7 +17,7 @@ Additionally we will miss words because:
We are missing a lot of words in these passwords, but that is why this is bad analysis.

# Data
## Data

The passwords come from several leaks. These include honey-net, MySpace, rockyou, hotmail, phpbb, and tuscl lists. All of these lists contain the count of how many times a password was used as well. In total there are 14,584,438 unique passwords.
@@ -25,9 +25,9 @@ This took forever to loop through, pulling out the words, then comparing them to
I'm comparing the password list to the American English word list found on Linux. There may be a more complete list somewhere out there, but this worked for me.

# Results
## Results

## Raw Data
### Raw Data

The words were extracted, counted, and sorted. There were 68,402 unique words, the top 10 words account for around 5% of total words seen, and 21,191 unique words were seen in only a single password.
@@ -48,7 +48,7 @@ All percentages are approximate
| and | 0.2 % |
| ito | 0.2 % |

## Additional Fun Stuff
### Additional Fun Stuff

How positive are people's passwords? Using a list of positive words found at [Positive List](https://gist.github.com/mkulakowski2/4289437) and a list of negative words found at [Negative List](https://gist.github.com/mkulakowski2/4289441), I've compared them to the word frequency from our list.
@@ -64,7 +64,7 @@ Positive words were used 1,172,617 times and negative words were used 1,172,617.
Looking at positive and negative occurrences has its own issues beyond just the word analysis. As you can see, there are certain omissions that I would think would be in positive, like "baby." There are also inclusions in negative that I would not have made, such as "mar", which could just be March for someone's birthday. Better lists would need to be found or crafted, or entire passwords would need a language processor to determine if they are negative or positive.

# Conclusion
## Conclusion

Not much to conclude here; mostly this was for fun. Don't use dictionary words in your password, since it doesn't take long to loop through the dictionary. If you do, try to use longer, random words rather than meaningful ones.
@@ -72,7 +72,7 @@ People tend to be more positive in their passwords which is nice to see.
This was a lot of fun to implement, and I may come back to this to see if I can improve upon looking at words.

# Future Work
## Future Work

- Thread all the things; maybe it'll run faster.
- Look for more than just the first word in each password
content/posts/code-complete-summations-metaphors.md (new file, 49 lines)
@@ -0,0 +1,49 @@
|
||||
---
|
||||
title: "Metaphors: Code Complete Summations"
|
||||
date: 2023-11-13
|
||||
draft: false
|
||||
---
|
||||
|
||||
## Introduction
|
||||
|
||||
This is the first entry in a new set of summations. Previously we looked at "Secure Coding in C and C++", this current set of summations are going to go over "Code Complete 2" by Steve McConnell. These summations will have a focus on security.
|
||||
|
||||
"Code Complete" uses a set of "metaphors" for describing software development styles. We will look at Penmanship, Farming, and Oyster Farming. We'll look at these and how they could affect the security of the final product.
|
||||
|
||||
## Introduction Part II: Summarizing the Metaphors

### Penmanship: Writing Code

Writing a program is like writing a letter: just sit down and write it, start to finish.

### Farming: Growing Code

Similar to planting one seed at a time, design and develop one piece at a time, testing each piece before moving on.

### Oyster Farming: Code Accretion

Code accretion is the slow growth of the whole system. Start by designing the system as a whole, create empty objects and methods, then start adding content and tests.
## Security Implications

### Penmanship: Writing Code

Writing code start to finish, like writing a letter, works great for a small program that only needs to complete one task. However, this quickly breaks down as the complexity of the program grows: it becomes easy to miss pieces, and rewrites are needed as more complexity is added. Testing is also not built into this method.

Two security concerns are immediately apparent with this method: testing and communication between pieces of code. With no testing built in, it becomes much more difficult to find problems early, and unfound problems become hidden by the complexity and harder to dig out later. Without an overall design, it is also much harder to have the different pieces of code communicate with each other, and more bugs can be introduced through their interaction. Parameter limits and return values can get mixed up, and any change can cause cascading issues.

### Farming: Growing Code

Here testing is built in, as each piece is not finished until its tests pass. This alleviates bugs (as much as possible) in individual parts of the code. However, overall code complexity is still a problem: interaction between pieces isn't thoroughly tested, and any change can cause cascading issues.

### Oyster Farming: Code Accretion

This metaphor provides the best option for creating secure large-scale projects. Starting with the overall design quickly shows how each piece needs to interact, produces (mostly) stable interfaces into each part, and reveals overall problems that might occur. By writing a skeleton of the code first, any piece can be worked on independently, since it isn't fully reliant on the other parts being complete.
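A minimal sketch of what accretion can look like in practice (the class and method names are illustrative, not from the book): the whole interface is laid out first as an empty skeleton, then each piece is filled in and tested on its own.

```python
# Skeleton first: the interface exists before any implementation does,
# so other components can be written against it immediately.
class Storage:
    def save(self, key: str, value: str) -> None:
        raise NotImplementedError  # accretion: to be filled in later

    def load(self, key: str) -> str:
        raise NotImplementedError

# First layer of accretion: one tested, working implementation.
class InMemoryStorage(Storage):
    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def save(self, key: str, value: str) -> None:
        self._data[key] = value

    def load(self, key: str) -> str:
        return self._data[key]

store = InMemoryStorage()
store.save("greeting", "hello")
```

Unimplemented pieces fail loudly instead of silently misbehaving, which is part of what makes this style friendlier to security.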
## Conclusion

These metaphors help show how the way code is built affects the final product, and each has its own security implications, since the goal is to stop bugs before they form.

Opinion time: Farming/Growing Code seems somewhat pointless. It's a good way to show that testing is needed, but without an overall design it's effectively Penmanship with testing. Penmanship is worthwhile when writing small scripts for personal work, and Oyster Farming is the best choice for anything more complex. Code accretion allows for a coherent design and testing, reducing bugs and thus reducing security holes.
@@ -0,0 +1,83 @@

---
title: "Pre-Requisites (Part I) Initial Design: Code Complete Summations"
date: 2023-12-20
draft: false
---
## Introduction

Prerequisites are incredibly important to any development project, and insecure design is number 4 in the [OWASP Top 10](https://owasp.org/Top10/A04_2021-Insecure_Design/). For the purposes of this post we will talk about prerequisites in the context of *security implications*.

## Planning Comes First
As the saying goes, failing to plan is planning to fail. Without a solid foundation, as when building a house, the entire program can fall. With no plan in place, code ends up added in a haphazard way, causing code paths to become unknown or unintentionally created. These code paths then become more difficult to maintain.

Another saying that applies: an ounce of prevention is worth a pound of cure. This is even more pertinent in software development from a security perspective. From a pure development perspective, preparation can save time later rewriting code that no longer fits or shoehorning in code that doesn't quite fit. For security, the lack of it can become a major issue if large flaws are found in the shipped software.
## How Prereqs are Used

### Where is this Software Used

The first thing to determine is where this software is going to be used: personal-local, personal-network, business, mission-critical, or embedded. These involve vastly different life cycles and security footprints.
#### Personal Project

A personal software project has a much smaller security footprint, particularly if it only runs locally. Personal projects can also be easily and quickly updated if flaws are found. While an initial plan is still needed, this type of project can be built a little more free-form.
#### Business

Business software will still be updated, but on longer cycles. A good plan and set of requirements is important, because any change will take time to deploy, and this is true for security fixes as well. Businesses tend to be risk averse and will not want to update on a regular basis, so be sure to set the requirements and plan ahead of time.
#### Mission Critical

Users will be hard-pressed to change this software if it's working: if it isn't broken, don't touch it. It will get updated very infrequently, so requirements and design are critical to make sure everything is developed properly and securely.
#### Embedded

Assume it will never be updated. Embedded projects need extremely tight requirements set ahead of time and an incredibly secure design. A tight plan here also means using tried-and-tested external dependencies, and should include very granular unit testing as well as integration testing.
### Iterative vs Sequential

#### Iterative (as-you-go)

An initial design is definitely still needed, but it should be flexible, since new requirements will be added all the time.

This approach suits personal projects and some business applications. It's best for applications where changing requirements later carries little or no cost, and works best when all the requirements are not known up front or are expected to change.
A good business example is a security application that monitors a system for security problems. An initial design can cover communication and how detection will be done, but what is detected will change over time.
#### Sequential

All requirements and design are completed before coding starts. This is a must for mission-critical and embedded projects, and is needed whenever changing things later is difficult or expensive.

For this method the requirements need to be stable, the design should be straightforward, and future requirements should be predictable.

Additionally, in my opinion, these should be either small projects or a series of small projects. Smaller projects tend to be easier to audit and their code paths easier to see, which reduces the possibility of security issues.
## Defining the Problem

The first and most important step is defining what you are trying to solve. Everything stems from this, so make the problem statement as narrow and specific as possible.

The problem should also be easily understandable, not only by everyone on the development team but also by the customer and users. Without a clear problem, requirements are difficult to define and may include things outside the scope of the project.
## Defining the Requirements

Official requirements let the user drive development rather than the programmer. That way the project will actually be useful to those using it.
### Evaluate the Requirements

STOP and make sure all requirements make sense and are specific enough. If anything is unclear or too vague, bring those concerns to the customer and have them get more specific. A good design can't happen without good requirements.
### Prep for Change

Using a strong problem statement, create an initial design that can handle some change; the only constant is that everything changes. A flexible design helps absorb those changes. One example is having different parts of the code run independently, with a strong, stable design for communication between the parts.
### Change Control

Customers are going to want more, so have a procedure in place to handle those requests. A formal request process helps filter vague or bad requests before they hit the developers.

A strong problem statement also gives you a way to push back: any request that goes outside the scope of the problem statement does not go into the current project.
## Conclusion

The initial problem statement and set of requirements can make or break a project's security. With a narrowly defined problem and set of requirements, it's much easier to design a system, and with a robust design, security can be included from the beginning.
@@ -0,0 +1,63 @@

---
title: "Pre-Requisites (Part II) Initial Design: Code Complete Summations"
date: 2023-12-26
draft: false
---
## Introduction

Prerequisites are incredibly important to any development project, and insecure design is number 4 in the [OWASP Top 10](https://owasp.org/Top10/A04_2021-Insecure_Design/). For the purposes of this post we will talk about prerequisites in the context of *security implications*.

This ended up being too big a topic for just one post, so here is part 2. In [Pre-Requisites Part 1](/posts/code-complete-summations-pre-requisets-part-1) we looked at why prerequisites are needed in general and how they apply to different types of projects. In part 2 we'll look at architectural prerequisites.
## Planning Comes First

As the saying goes, failing to plan is planning to fail. Without a solid foundation, as when building a house, the entire program can fall. With no plan in place, code ends up added in a haphazard way, causing code paths to become unknown or unintentionally created. These code paths then become more difficult to maintain.

Another saying that applies: an ounce of prevention is worth a pound of cure. This is even more pertinent in software development from a security perspective. From a pure development perspective, preparation can save time later rewriting code that no longer fits or shoehorning in code that doesn't quite fit. For security, the lack of it can become a major issue if large flaws are found in the shipped software.
## Why Architectural Prerequisites

General prerequisites need to be generic enough that the customer and users can understand what is required. Architectural prerequisites are for the developers themselves. This is hugely important, as it keeps the code base consistent and easy to maintain. From a security perspective this is vital: the less chaos in the code, the fewer mistakes there will be. Making the code more maintainable also means that when bugs or security flaws are found, they take less time and effort to fix.

A solid architectural foundation also lets the developers break up work where appropriate. See [Data Design](#data-design) for more detail.
## Architectural Features

Architectural designs can be broken down into multiple pieces, each with its own considerations.
### Communication

How will this software communicate, both between components in the project and with systems external to it? This covers both the protocol and the data structures on the wire.

Protocols are vital here, as they determine how secure communications between programs or across networks will be. Pick something that either has encryption by default or can have it easily added. Authentication is also a must: some protocols or services have built-in authentication methods, while others need it worked into the initial connection. These things need to be thought through ahead of time, before diving in.
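As a minimal sketch of choosing "secure by default" at design time, here is a TLS client context using Python's standard `ssl` module; the commented-out cert/key paths for mutual authentication are hypothetical:

```python
import ssl

def make_client_context() -> ssl.SSLContext:
    """Build a TLS context that encrypts and authenticates by default."""
    ctx = ssl.create_default_context()            # verifies server certs
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse legacy protocol versions
    ctx.check_hostname = True                     # authenticate the server name
    # For mutual (client) authentication, a cert/key pair would be loaded:
    # ctx.load_cert_chain("client.crt", "client.key")  # hypothetical paths
    return ctx

ctx = make_client_context()
```

The point is that these properties are decided once, in the design, instead of being bolted onto each connection later.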
The data structure must be coordinated between every piece involved. See [Data Design](#data-design) for more detail.
### Major Classes

Creating a skeleton of all the major classes goes a long way toward ensuring a good design, which in turn helps keep the project secure. With the skeleton in place it becomes more obvious what is missing and where each component will live. Having an experienced engineer design and build the skeleton also makes it easier for junior developers to take over the actual implementation.
### Data Design

The way the data is designed can have a major impact on security, and there are different types of data to consider when designing a secure system. Any data that is considered sensitive, such as PII, should be encrypted both at rest and in transit. Any data that users should not be able to alter should probably also be encrypted both at rest and in transit.

Data that can or should be readable or editable by the user does not need to be encrypted at rest.

All data should be encrypted in transit to reduce the possibility of a man-in-the-middle reading or altering the content.

The data design also needs to be agreed upon by all parties using the data. If the design and restrictions are interpreted differently on each end, the mismatch will cause read and processing errors.
### User Interface

The UI should be a separate component that uses an API to communicate with the backend. This modular approach allows flexibility and naturally leads to defense in depth: since the UI is its own component, user input should be sanitized on the UI side, and because the backend cannot trust a separate component, all user input should be sanitized on the backend as well.
### Error Processing and Logging

Here is a big one: error handling needs to be designed from the beginning. If errors are handled through a mix of exceptions, integer return codes, parameters passed by reference, and so on, the code base becomes confusing and errors get missed. There should be one error-handling design used universally across the project, so all developers know how to handle errors.

Logging needs to be taken into account for two big reasons. The first is having the appropriate amount of detail to diagnose and fix errors. The second is deciding how much data, and which data, will be written out, as careless logging can unintentionally leak sensitive information.
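A minimal sketch of the second logging concern, deciding up front which fields never reach the log; the field names here are illustrative assumptions:

```python
import logging

# Fields that must never appear in log output (illustrative list).
SENSITIVE_KEYS = {"password", "ssn", "credit_card"}

def redacted(event: dict) -> dict:
    """Return a copy of the event with sensitive fields masked."""
    return {k: ("***" if k in SENSITIVE_KEYS else v) for k, v in event.items()}

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("auth")

event = {"user": "alice", "password": "hunter2", "action": "login"}
log.info("login attempt: %s", redacted(event))  # diagnostic detail, no secret
```

Because redaction is part of the design rather than left to each call site, one forgotten `log.info` can't leak a credential.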
## Conclusion

A good design from the beginning helps prevent problems before they even arise, and when bugs and security issues are found anyway, a good architecture helps locate them faster.
@@ -0,0 +1,52 @@

---
title: "Pre-Requisites (Part III) Initial Design: Code Complete Summations"
date: 2024-03-05
draft: false
---
## Introduction

Prerequisites are incredibly important to any development project, and insecure design is number 4 in the [OWASP Top 10](https://owasp.org/Top10/A04_2021-Insecure_Design/). For the purposes of this post we will talk about prerequisites in the context of *security implications*.

This ended up being too big a topic for just two posts, so here is part 3. In [Pre-Requisites Part 1](/posts/code-complete-summations-pre-requisets-part-1) we looked at why prerequisites are needed in general and how they apply to different types of projects. In [Pre-Requisites Part 2](/posts/code-complete-summations-pre-requisets-part-2) we looked at how various architectural prerequisites help with security. In part 3 we'll look at resource and error management prerequisites.
## Planning Comes First

As the saying goes, failing to plan is planning to fail. Without a solid foundation, as when building a house, the entire program can fall. With no plan in place, code ends up added in a haphazard way, causing code paths to become unknown or unintentionally created. These code paths then become more difficult to maintain.

Another saying that applies: an ounce of prevention is worth a pound of cure. This is even more pertinent in software development from a security perspective. From a pure development perspective, preparation can save time later rewriting code that no longer fits or shoehorning in code that doesn't quite fit. For security, the lack of it can become a major issue if large flaws are found in the shipped software.
## Resource Management

Resource management covers not just how much memory or processing power is used, but also database connections, threading, and file handles. These are vital to security because they deal with how data is accessed and processed.

Planning ahead for resource management avoids a lot of issues from the beginning. If it isn't taken into account, the code will need to be retrofitted to fix any security issues, which can lead to inconsistent handling of resources or old code being left behind. Planning ahead is the best way to combat these problems.
### Databases

Databases require thoughtful setup. A secure password is only the start: encrypting the database (particularly file-based databases such as SQLite) should be considered, and if the database is remote, a secure connection is necessary.

How the data is accessed needs consideration as well, including sanitizing user input, deciding when and how to update data, and ordering read and write sequences. Getting these wrong can cause leaks or corruption of data.
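A minimal sketch of sanitizing user input at the database boundary, using Python's standard `sqlite3` module; the schema and the malicious string are illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "alice' OR '1'='1"  # classic injection attempt

# Safe: the input is bound as a parameter, never spliced into the SQL text,
# so the driver treats the whole string as one literal value.
rows = conn.execute(
    "SELECT role FROM users WHERE name = ?", (user_input,)
).fetchall()
```

The parameterized form decided at design time means no individual call site can reintroduce string-built SQL.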
### Threading

Threading is relevant to data corruption: if two writes occur at the same time, or data is read while a write is happening, the data can be corrupted. Shared variables across threads can also cause security issues such as use-after-free, double free, or accessing data outside of scope.
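A minimal sketch of guarding a shared variable so concurrent writes cannot interleave; the counter is a stand-in for any shared state:

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n: int) -> None:
    global counter
    for _ in range(n):
        with lock:  # serialize the read-modify-write on the shared variable
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# With the lock, counter is exactly 4 * 10_000; without it, updates can be lost.
```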
### File Handles

In a previous post, [File IO](/posts/secure-coding-in-c-summations-file-io), we discussed in detail why file access is a security issue. Designing from the beginning how files will be accessed greatly reduces those issues.
## Error Processing

Being able to handle errors properly is critical for security. A few questions need to be answered up front, and the answers will shape how the software is architected:

- Corrective vs detective: are you going to try to fix whatever error happens? If so, you need to handle *all* issues, since missing one could open a hole through the program.
- Active vs passive: active detection can allow for correction, but requires all paths to be checked, whereas passive detection can let the program crash.
- How to propagate errors: does each method check its own errors or push them up? When do errors get checked or propagated?
- What is the convention for handling errors: output to a log, correct the error, or try to return useful data even when there is an error?
These questions shouldn't be taken lightly; each one can have security consequences. In addition, simply keeping things consistent is good for security, since every developer will know what to expect from another developer's code.
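A minimal sketch of one project-wide convention answering those questions (an illustrative choice, not the book's prescription): every component raises subclasses of a single base error, propagation is always upward, and only the top level decides how to report:

```python
class AppError(Exception):
    """Base class for all errors this project raises."""

class ConfigError(AppError):
    pass

def load_port(raw: str) -> int:
    """Parse a port number, wrapping low-level failures in the project error type."""
    try:
        port = int(raw)
    except ValueError:
        # Callers only ever see AppError subclasses, never raw ValueError.
        raise ConfigError(f"port is not a number: {raw!r}")
    if not 0 < port < 65536:
        raise ConfigError(f"port out of range: {port}")
    return port

def main(raw_port: str) -> int:
    try:
        return load_port(raw_port)
    except AppError as exc:  # one catch point, one reporting convention
        print(f"error: {exc}")
        return -1

result = main("8080")
bad = main("not-a-port")
```

Because there is exactly one error family and one catch point, no code path can silently swallow a failure in its own ad hoc way.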
## Conclusion

A good design from the beginning helps prevent problems before they even arise, and when bugs and security issues are found anyway, a good architecture helps locate them faster.
51
content/posts/code-complete-summations-variable-names.md
Normal file
@@ -0,0 +1,51 @@

---
title: "Variable Usage: Code Complete Summations"
date: 2024-02-23
draft: false
---
## Introduction

In this summation of "Code Complete 2" by Steve McConnell we will focus on variable naming and usage and how they relate to security. Variable naming is an essential aspect of software development, and it plays a critical role in ensuring software security.
## Importance of Variable Naming

Variable naming matters for software security because it helps prevent common programming errors that can lead to vulnerabilities. If a variable is named poorly, its purpose is hard to understand, which leads to confusion and errors in the code, and in turn makes it easier for attackers to find something exploitable.

Poorly named variables also make it harder to identify and fix security vulnerabilities. If a variable is named "userInput," it may not be immediately clear that it contains untrusted data that must be validated and sanitized, which can lead to vulnerabilities such as SQL injection or cross-site scripting (XSS).

On the other hand, well-named variables help prevent vulnerabilities by making clear what data the variable contains and how it should be used. A variable named "sanitizedUserInput" clearly indicates that the data has been sanitized and is safe to use in a SQL query or HTML page.
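A minimal sketch of letting names carry security state, using Python's standard `html.escape`; the input string is an illustrative attack payload:

```python
import html

# The raw value and the sanitized value live in separate, clearly named
# variables, so it is obvious which one is safe to embed in an HTML page.
rawUserInput = '<script>alert("xss")</script>'
sanitizedUserInput = html.escape(rawUserInput)

page = f"<p>You searched for: {sanitizedUserInput}</p>"
```

A reviewer who sees `rawUserInput` inside an f-string building HTML knows immediately that something is wrong, which is the whole point of the naming convention.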
## How to Name Variables

There are several things to keep in mind when naming variables:

1. Use descriptive names: use variable names that accurately describe the data they contain. For example, instead of a name like "x," use a name like "sanitizedUserInput" to indicate that the data has been sanitized and is safe to use.
2. Avoid ambiguous names: avoid variable names that are ambiguous or confusing, such as "data" or "info," since they provide no information about what the variable contains.
3. Use meaningful prefixes: use prefixes to indicate the state of the data a variable contains, such as "sanitized" to indicate the data is safe to use.
4. Use consistent naming conventions: consistent conventions throughout the code make it easier to understand and maintain, and help prevent errors and security vulnerabilities.
5. Avoid sensitive data in variable names: never embed sensitive data, such as passwords or credit card numbers, in variable names, where it could be accidentally exposed or leaked.
## Variable Usage

Another important aspect of variables is how they are used.
### Position

Just as good variable names help with the readability and maintainability of the code base, so does variable position. Declaring a variable close to its usage keeps the code organized and helps ensure variables are freed when leaving scope.

Keeping the declaration close to the use also keeps the variable's "live time" short. The less code a particular variable spans, the less likely it is to be misused.
### Initialization

All variables should be initialized as they are declared. Doing so avoids ever using an uninitialized variable, which is particularly important for pointers, where a stray value can cause crashes or out-of-bounds reads and writes.
### One Purpose

When declaring and using a variable, make sure it serves only one specific purpose. If the reason for a variable's existence changes partway through, the code base becomes confusing and hard to maintain, and mistaken identity leads to errors.

This is especially problematic in dynamically typed languages, where not only the purpose but the type of the variable can change.
## Conclusion

In conclusion, variable naming and usage are important aspects of software security. Descriptive, meaningful variable names help prevent common programming errors and security vulnerabilities, and keeping variables close to use, short-lived, and single-purpose increases the maintainability of the code base and reduces the possibility of misuse.
@@ -4,17 +4,17 @@ date: 2023-03-30
draft: false
---

# Introduction
## Introduction
Currently one of my projects uses "pinned" certs to securely communicate back to a REST service. The certs are pinned to allow truly secure authentication of the server, preventing a rogue certificate authority (CA) from issuing a fake cert and enabling man-in-the-middle (MITM) attacks. This is a huge hassle, as the server and client need to stay in sync: it involves cutting a new release just to update certs, then trying to get it deployed within the expiration/reissue window. [Enrollment over Secure Transport](https://www.rfc-editor.org/rfc/rfc7030.html) (EST) should provide a better way to issue certs from the server, so the client just has to request and download the new ones.
# What is EST
## What is EST

EST allows a client to authenticate to the EST server, which then delivers a client cert, either unique to the client or generic for all clients. Issued certificates can then be used to re-authenticate to the EST server and get an updated cert. With this re-authentication method, a client can automatically get the most up-to-date cert in a secure way, and because the cert is not compiled into the binary (i.e. pinned), a new release is not needed simply to update it.

To do this, the client authenticates to the EST server, either via a public/private key pair or a username/password, and the client authenticates the server, either through the same public/private key challenge or an external CA. Once authenticated, the EST server issues the correct cert. All communication is over a TLS connection.
# Possible Setup
## Possible Setup

First, no to username/password. With username/password authentication, the client relies on an external CA to authenticate the server, which is exactly what "pinning" was supposed to remove. So if username/password is used, there is no real need for an EST server, and the client can just connect directly to the server (for our use case).
@@ -33,7 +33,7 @@ Cons of a separate key
Being able to easily revoke and re-issue a private key is the deciding factor for me; it is the true solution to the problem of pinning. Building the private key into the binary helps with the pinning issue, since it doesn't need to be updated as frequently, but it really just delays the problem. Yes, it's more work for the client to get everything set up, but a little inconvenience shouldn't get in the way of good security.
# Final Proposal
## Final Proposal

The final setup could look something like this:
@@ -50,7 +50,7 @@ Once the software is installed it would:
1. Client uses TLS cert to connect and authenticate to backend server
1. When TLS cert expires, it can be used to re-auth with the EST and download the next TLS cert
# Conclusion
## Conclusion

Using a pub/priv key pair to authenticate to an EST server, then using the issued TLS cert for ongoing authentication, is the best way to remove the need for pinned certificates and username/passwords. The private key is the primary client credential, since that key pair is what obtains the TLS cert, and using the TLS cert for authentication means a client doesn't need to continuously update passwords. With the private key separate from the binary and the TLS cert used for authentication, it becomes relatively simple to re-issue creds when a system is compromised.
@@ -4,15 +4,15 @@ date: 2019-09-26
draft: false
---

# Introduction
## Introduction
In this post we will explore a brief overview of the fast-flux (FF) technique used by botnets. [Here is my full paper](/security/FastFluxPaper.pdf) with more detail on what a botnet is and how FF works.

# Botnet Overview
## Botnet Overview
Botnets are a major threat to everyone connected to the Internet. They are used for distributing spam, hosting malicious code, sending phishing attacks, and performing a variety of attacks, including denial of service (DoS). Many botnets use DNS names to control or connect to the botnet. This would seemingly be easy to shut down by blocking the particular domain; however, through a technique called fast-flux (FF), botnets are able to evade detection and mitigation.

# Fast Flux Overview
## Fast Flux Overview
Fast-flux is the process of quickly changing the domain name or IP addresses associated with a domain in order to hide the bot-master, or command and control (CC), for the botnet. These fast changes are accomplished through two primary technologies, dynamic DNS (DynDNS) and round robin.

@@ -24,11 +24,11 @@ Round robin was a technique developed for load balancing. Sites that see a large

In addition to DynDNS and round-robin, some botnets will be double-fluxed. In this technique a botnet will set up its own name servers and rotate through them as well. More detail is in the paper.
# Detection/Mitigation
## Detection/Mitigation

There are two primary ways of detecting and mitigating fast-fluxing botnets, and they need to be used in conjunction. The first is to look at the time-to-live (TTL) for which DNS entries may be cached: fast-fluxing botnets tend to use very short TTL values compared to legitimate domains. The second is keeping an "FF Activity Index" of how often name-address relationships change, tracking both how often the IP address for a given domain changes and how often domains change for a single IP address. Even looking at these two indicators still yields a number of false positives. More details are in the paper.
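A minimal toy sketch of combining those two indicators; the thresholds and sample records are illustrative assumptions, not values from the paper:

```python
# Legitimate domains usually allow long caching and keep a stable IP set;
# fast-flux domains combine a short TTL with rapid IP churn.
TTL_THRESHOLD = 300    # seconds (illustrative)
CHURN_THRESHOLD = 5    # distinct IPs seen for one domain (illustrative)

def looks_fast_flux(ttl: int, ips_seen: set) -> bool:
    """Flag a domain whose TTL is short AND whose IP set churns quickly."""
    return ttl < TTL_THRESHOLD and len(ips_seen) >= CHURN_THRESHOLD

legit = looks_fast_flux(86400, {"198.51.100.7"})
suspect = looks_fast_flux(60, {"203.0.113.%d" % i for i in range(12)})
```

Either indicator alone produces many false positives (short TTLs are also used by CDNs, for example), which is why the paper pairs them.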
# Conclusion
## Conclusion

Botnets are getting more sophisticated, and more research is needed to detect these techniques. The best way to block these connections is to stop the CC directly, but most hide behind proxies, and many use FF techniques to hide those. FF is an arms race between detection and ever more sophisticated ways of hiding activity.
38
content/posts/pseudo-random-number-generators.md
Normal file
@@ -0,0 +1,38 @@

---
title: "Pseudo Random Number Generators"
date: 2024-03-22
draft: false
---
## Introduction
|
||||
|
||||
Pseudo-random number generators (PRNGs) play a crucial role in modern cryptography and information security. These algorithms generate seemingly random sequences of numbers, which are essential for tasks like encryption, secure key generation, and digital signatures. PRNGs in the past have had many issues with predictability. Looking at the current and future research requires a look at how predictable the numbers really are.

## External Techniques

Several techniques have arisen to generate random numbers, both on local machines and using real-world chaos. There are a few ways to integrate physical phenomena in the real world to generate random numbers.

### Lava Lamps

[Lavarand](https://en.wikipedia.org/wiki/Lavarand) uses a video feed of a wall of lava lamps to generate random numbers. It does so by taking a high-definition snapshot of the video feed, then hashing that image to generate a seed for a PRNG. The more random the seed, the more random the numbers that will be generated. Since the state of the lava lamps, particularly accumulated over all lamps, is unpredictable, the seed is also unpredictable.
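A minimal sketch of the hashing step in C, using FNV-1a as the (illustrative) hash; `fnv1a` is my own helper, not a standard function:

```c
#include <stddef.h>
#include <stdint.h>

/* Hash an arbitrary byte buffer (e.g. the pixels of a captured
 * lava-lamp frame) into a PRNG seed with FNV-1a. */
uint32_t fnv1a(const uint8_t *buf, size_t len)
{
    uint32_t h = 2166136261u;           /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= buf[i];
        h *= 16777619u;                 /* FNV prime */
    }
    return h;                           /* feed this to the PRNG as a seed */
}
```

Any change to any pixel changes the resulting seed, which is what makes the camera feed a useful entropy source.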

### Radioactive Decay

Using Geiger counters to detect the background decay of radioactive material allows the generation of random seeds as well. As far as we currently know, radioactive decay has no distinct pattern and is thus unpredictable. Using it to generate seeds for PRNGs will generate random numbers.

### Background Sound

Another physical phenomenon that is difficult to predict is background noise. It's almost impossible to predict not just what will be making sound at any given moment, but also the direction, intensity, and frequency of that sound. By hashing background noise, a random seed can be generated, making it almost impossible to predict the output of a PRNG.

## Internal Techniques

Not all personal computers have access to these physical phenomena. If they don't have access to a camera, microphone, network connection, or Geiger counter, there are sensors that most computers do have that can be used. Most motherboards and graphics cards have both power meters and temperature sensors. Taking as accurate a measurement as possible of temperature, power draw, fan speeds, and time can produce fairly unpredictable values. Some of these values are correlated (e.g., higher power draw leads to higher temperatures, which leads to higher fan speeds), but they should still produce numbers that are unpredictable enough. Using all these values together is a good way to generate a seed.
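A rough sketch of mixing such readings into one seed; the parameter names are placeholders for values a real program would read from platform-specific sensor APIs:

```c
#include <stdint.h>

/* Mix several machine-local readings into one seed. The parameters are
 * illustrative stand-ins for real sensor and clock values. */
uint64_t mix_seed(uint64_t temp_milli_c, uint64_t power_mw,
                  uint64_t fan_rpm, uint64_t nanos)
{
    uint64_t h = temp_milli_c;
    h = h * 6364136223846793005ULL + power_mw;  /* LCG-style mixing */
    h = h * 6364136223846793005ULL + fan_rpm;
    h = h * 6364136223846793005ULL + nanos;
    h ^= h >> 33;                               /* spread high bits down */
    return h;
}
```

Even if the individual readings are correlated, folding all of them together with a timestamp makes the combined seed hard to guess from outside the machine.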

Another way is to track user movements. Having a user move the mouse pointer around or type on the keyboard can help generate a random seed. By tracking pointer position, acceleration, and speed, along with which keyboard keys are pressed and how quickly they are pressed, a fairly random seed can be generated.

These internal techniques do not require permissions to external resources.

## Conclusion

Most PRNGs in the past used simple seeds, usually just the time of the run. Newer techniques create more random numbers by using real-world conditions. Using these conditions to generate seeds provides better pseudo-random numbers.
@@ -4,17 +4,17 @@ date: 2020-04-17
draft: false
---

# Introduction
## Introduction

After reading through "Silence on the Wire" by Michal Zalewski for the 8th time, I decided I wanted to try the random algorithm analysis he did in Chapter 10. He looked at the relationship between sequential numbers by graphing them in a 3D scatter plot. My idea was to see if any of the algorithms had been updated to make them more secure.

There was a problem with that, however. I only own one computer, and it's too low-powered to run VMs. So I was stuck with the Python algorithm, shuf, urandom, and two online random number generators. This was a big limitation, and I hope to update this whenever I get a new computer.

# The Importance
## The Importance

Random algorithms cannot be predictable, for security reasons. All encryption algorithms use random digits to generate keys. If the keys are predictable, then the encryption can be broken. "Silence on the Wire" showed some random algorithms having limited ranges or predictable patterns that reduce the search space. Luckily the newer algorithms seem to be doing better.

# The Math
## The Math

Using the math in "Silence on the Wire" to create the graphs allows me to compare more directly to Mr. Zalewski's. Of course this ended up not really mattering, since I was so limited. For a better explanation see Chapter 10 of the book, but here is a quick rundown: take data samples S0, S1, S2, ... from a randomly generated sequence, then calculate the deltas and graph those.
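As I read the description, the construction is: take successive differences of the samples, then plot consecutive delta triples as 3D points. A small sketch of the delta step (`deltas` is my own helper name):

```c
#include <stddef.h>

/* d[i] = s[i+1] - s[i]; consecutive triples (d[t], d[t+1], d[t+2])
 * then become the points of the 3D scatter plot. */
void deltas(const long *s, size_t n, long *d)
{
    for (size_t i = 0; i + 1 < n; i++)
        d[i] = s[i + 1] - s[i];
}
```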

@@ -33,13 +33,13 @@ Then we graph the deltas in a 3D scatter plot using the following points:
.
.

# The Samples
## The Samples

The data came from the following sources: JS Math, Python's numpy package, random.org, Bash shuf, and urandom. Here are the graphs that were produced ... don't get excited, they are all basically the same:

Unfortunately my blog server crashed, so I've lost the images for now; I'll add them in later. The long and short of it is they all look basically the same.

# Conclusion
## Conclusion

Why are these all basically the same? ... probably because they all use the exact same algorithm. I was hoping Python had its own built-in PRNG, but it appears to use whatever the host uses. It makes sense that shuf and urandom are the same: shuf is basically a wrapper around urandom that gives the user more control.
@@ -4,7 +4,7 @@ date: 2022-12-06
draft: false
---

# INTRODUCTION
## INTRODUCTION

RSA is a public key cryptosystem, which was named after the creators of
the algorithm: Rivest, Shamir, and Adleman [@STALLINGS]. It is widely
@@ -55,7 +55,7 @@ decrypting messages. However the same instruction set architecture we
propose in this paper can be used for signing and verifying messages
with RSA.

# CHARACTERISTICS OF RSA
## CHARACTERISTICS OF RSA

There are three areas in which RSA can be optimized: finding the
encryption and decryption exponent, prime number generation, and
@@ -64,7 +64,7 @@ finding the encryption and decrypting exponent, prime number generation,
and encrypting and decrypting the message is usually done, without
specialized instructions.

## ENCRYPTION AND DECRYPTION EXPONENT
### ENCRYPTION AND DECRYPTION EXPONENT

The current approach to verifying that the encryption exponent is
coprime to `φ(n)` is by using the Euclidean Algorithm. To find
@@ -150,7 +150,7 @@ we will use two instructions that will combine some of the instructions
used in the Extended Euclidean Algorithm to reduce the number of stalls
within the loop.

## PRIME NUMBER GENERATION
### PRIME NUMBER GENERATION

The common approach to generating large primes in making encryption and
decryption keys is to randomly select integers and test them for
@@ -199,7 +199,7 @@ shown below). The instructions to calculate `x^{2} (mod n)` would be
executed `s` times. These two factors indicate a heavy reliance on the
ability of a system to calculate exponentiation.

## ENCRYPTION AND DECRYPTION
### ENCRYPTION AND DECRYPTION

One aspect of RSA to improve upon is performing large exponentiation.
Currently the implementation of exponentiation is performed by the
@@ -222,13 +222,13 @@ to be available. We can lessen the number of stalls using a technique
known as exponentiation by squaring. This technique is explained further
in the design section.
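Exponentiation by squaring can be sketched in plain C like this (a simplified software version, not the specialized instruction; note that `(b * b)` can overflow for moduli near or above 2^32):

```c
#include <stdint.h>

/* Square-and-multiply: walk the exponent's bits, squaring each round
 * and multiplying in when a bit is set, so a k-bit exponent costs
 * O(k) modular multiplications instead of up to 2^k. */
uint64_t mod_pow(uint64_t b, uint64_t e, uint64_t m)
{
    uint64_t result = 1 % m;
    b %= m;
    while (e > 0) {
        if (e & 1)
            result = (result * b) % m;  /* bit set: multiply in */
        b = (b * b) % m;                /* square every round */
        e >>= 1;
    }
    return result;
}
```

This is the software baseline the proposed `pow`-style instruction is meant to accelerate.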

# DESIGN
## DESIGN

In this section, we will describe specialized instructions that will be
used for prime number generation, computing the encryption and
decryption exponent, and encrypting and decrypting a message.

## ENCRYPTION AND DECRYPTION EXPONENT
### ENCRYPTION AND DECRYPTION EXPONENT

The issue with implementing the Euclidean Algorithm the traditional way
is that divide, multiply, and subtract instructions are needed for each
@@ -320,7 +320,7 @@ an analysis to the speedup given to the Extended Euclidean Algorithm by
using the modular instruction and the multiply-subtract instruction in
the justification and analysis section.

## PRIME NUMBER GENERATION, ENCRYPTION, AND DECRYPTION
### PRIME NUMBER GENERATION, ENCRYPTION, AND DECRYPTION

One issue already discussed in the previous section is that of stalls
during large exponents. The way exponentiation is handled causes many
@@ -375,12 +375,12 @@ depending on the digit, may use the second accumulator. It will then run
through one more multiplier to multiply the squares by the 1's
multiplier.

# JUSTIFICATION AND ANALYSIS
## JUSTIFICATION AND ANALYSIS

In this section, we will describe how our specialized instructions will
improve the performance of the RSA encryption.

## ENCRYPTION AND DECRYPTION EXPONENT
### ENCRYPTION AND DECRYPTION EXPONENT

Using the modular instruction in the Euclidean Algorithm, we can reduce
the number of stalls needed. Instead of needing to stall for the result
@@ -441,7 +441,7 @@ speedup of 1.23. Also, an advantage to using the modular and
multiply-subtract instructions is we reduce the number of temporary
registers needed from five to three.

## PRIME NUMBER GENERATION, ENCRYPTION, AND DECRYPTION
### PRIME NUMBER GENERATION, ENCRYPTION, AND DECRYPTION

Using the pow command we can cut the stalls of large exponents in half.
Since the algorithm breaks the exponent into a binary representation of
@@ -472,7 +472,7 @@ the following equations to determine the overall speedup:
Speedup = 1.25/1.125
Speedup = 1.11

# CONCLUSIONS
## CONCLUSIONS

In analyzing the typical algorithms used as a part of RSA, we have
identified two primary bottlenecks in both encryption and decryption:
@@ -502,7 +502,7 @@ system with the sole task of encrypting and decrypting messages using
RSA, we can create hardware that allows the specialized instructions to
have a latency comparable to the traditional instructions.

# Bibliography
## Bibliography

3. Beauchemin, Pierre, Brassard, Crepeau, Claude, Goutier, Claude, and
Pomerance, Carl. *The Generation of Random Numbers That Are Probably Prime*

@@ -4,27 +4,27 @@ date: 2023-01-27
draft: false
---

# Introduction
## Introduction

Continuing summarizing the themes in "Secure Coding in C and C++" by Robert C. Seacord, we will discuss concurrency. When code runs at the same time and needs access to the same resources, lots of issues can occur. These range from the annoyance of reading incorrect data, to halting deadlocks, to vulnerabilities.

The tl;dr: use `mutex`es. There are a lot of methods for controlling concurrency, but many use `mutex`es in the background anyway. A `mutex` is the closest thing to guaranteed sequential access, without risking deadlocks.

## Importance
### Importance

To quote Robert C. Seacord, "There is increasing evidence that the era of steadily improving single CPU performance is over. ... Consequently, single-threaded applications performance has largely stalled as additional cores provide little to no advantage for such applications"

In other words, the only real way to improve performance is through multi-threaded/multi-process applications; thus being able to handle concurrency is very important.

# The Big Issue
## The Big Issue

Race conditions! That's the big issue: when two or more threads or processes attempt to access the same memory or files. The problems arise when two writes happen concurrently, when reads occur before writes, or when reads occur during writes. This can lead to incorrect values being read, incorrect values being set, or corrupted memory. These types of flaws, and insufficient fixes for them, can cause vulnerabilities in programs as well.

# How Do We Keep Memory Access Sane
## How Do We Keep Memory Access Sane

So what is the fix? There are several possible ways to keep things in sync, but the number one way that will "always" work is a `mutex`. In fact most of the other "solutions" are just an abstracted `mutex`. We will briefly go over a few solutions: global variables, `mutex`es, and atomic operations.

## Shared/Global Variables
### Shared/Global Variables

A simple solution that is **NOT** robust is simply having a shared "lock" variable. A variable, which we'll call `int lock`, that is `1` when locked and `0` when unlocked, is accessible between threads. When a thread wants to access a memory location it simply checks that the variable is in the unlocked state, `0`, locks it by setting it to `1`, then accesses the memory location. At the end of its access, it simply sets the variable back to `0` to "unlock" the memory location.
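In C the naive scheme looks something like this; the gap between the check and the set is exactly where two threads can race:

```c
/* Naive shared "lock" flag: NOT robust. The while-check and the
 * assignment are separate steps, so two threads can both see lock == 0
 * and both enter the critical section; the busy-wait also burns CPU. */
int lock = 0;
int counter = 0;

void enter_critical(void)
{
    while (lock == 1)
        ;            /* spin until "unlocked" */
    lock = 1;        /* too late: another thread may already be inside */
    counter++;       /* the shared resource being "protected" */
    lock = 0;        /* "unlock" */
}
```

The code is correct when called from a single thread, which is precisely why the bug is easy to miss.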

@@ -36,7 +36,7 @@ The third issue is compiler optimization (future blog coming regarding that hot

The third issue *can* be solved through compiler directives, but that still doesn't solve the first two issues.

## `mutex`
### `mutex`

Fundamentally, a `mutex` isn't much different than a shared variable. The `mutex` itself is shared among all threads. The biggest difference is that it doesn't suffer from any of the three issues. The threading library handles things properly such that a "check" on the `mutex` and a "lock" happen atomically (meaning that nothing can happen in between). This handles the issue of reading the variable before another thread writes, and of the compiler trying to optimize things away. `mutex`es also handle waiting a little differently and thus need less CPU while waiting.
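A minimal POSIX-threads version of the pattern (using pthreads here; the book's examples may use a different threading API):

```c
#include <pthread.h>

/* Two workers increment a shared counter; the lock/unlock pair makes
 * each increment atomic with respect to the other thread, so no
 * updates are lost. */
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
long counter = 0;

void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&m);
        counter++;                   /* critical section: keep it small */
        pthread_mutex_unlock(&m);
    }
    return (void *)0;
}
```

Run two of these workers and the final count is exact; with the naive flag from the previous section, updates would be lost.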

@@ -44,13 +44,13 @@ The only drawback to the `mutex` is that it can still cause a *deadlock* when no

To solve the possible *deadlock* of not unlocking the `mutex`, `atomic` operations were added.

## Atomic Operations
### Atomic Operations

Atomic operations attempt to solve the issue of forgetting to unlock the `mutex`. An atomic operation is a single function call that performs multiple actions on a single shared variable. These operations can be checking and setting (thus making them semi-useful as a shared locking variable), swapping values, or writing values.
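A short C11 `<stdatomic.h>` sketch of the kinds of operations described; `atomic_demo` is an illustrative wrapper:

```c
#include <stdatomic.h>

/* C11 atomics: each call below is one indivisible read-modify-write,
 * with no separate check-then-set window for another thread to exploit. */
int atomic_demo(void)
{
    atomic_int n = 0;
    atomic_fetch_add(&n, 5);                 /* n: 0 -> 5, atomically */
    int old = atomic_exchange(&n, 42);       /* swap in 42, return prior */
    return old * 1000 + atomic_load(&n);     /* encode both for checking */
}
```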

Atomic operations are very limited in their use cases, since there are only so many built-in methods. If they work for your use case there really isn't much downside to using them. However, since they are limited and use a `mutex` in the background anyway, a `mutex` with proper error handling and releasing is probably the best way to go.

## Other Solutions
### Other Solutions

Lock Guard:
- C++ object that handles a `mutex`, useful for not having to worry about unlocking the `mutex`; the only real downside is it's C++ only
@@ -61,21 +61,21 @@ Fences:
Semaphore:
- A `mutex` with a counter. Can have good specific use cases, but just uses a `mutex` in the background. Unless needed, just use a `mutex`

# Obvious bias is Obvious
## Obvious bias is Obvious

Just use a `mutex`. Most of the additional solutions are either simply a `mutex` in the background or cause other issues. A `mutex` will just work. Just be sure to properly unlock when done and possibly have timeouts in case another thread gets stuck.

With a `mutex` you have way more control over the code and way more flexibility in how it's used. An arbitrary amount of code can be put in between without having to finagle a use case into a limited number of function calls.

# Keep it Sane
## Keep it Sane

There is one additional tip for concurrency: lock as little code as possible. Having as few operations as possible between a `mutex` lock and unlock reduces the possibility of timeouts, deadlocks, and crashes. It also helps to reduce the possibility of forgetting to unlock. Do not surround an entire method (or methods) with locks; rather, lock just the read and write operations.

## The Dreaded `GOTO`
### The Dreaded `GOTO`

When it comes to locking, `goto` is your friend. Have an error or exception inside a lock? `goto` the unlock. This also works for clearing memory: have an error, `goto` the `free` and memory cleanup. Keep the `goto` sane by only jumping within the current method.
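A sketch of the unlock anchor with pthreads; `validate` and `handle` are made-up stand-ins for real work:

```c
#include <pthread.h>
#include <stddef.h>

pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;

/* Every failure path jumps to one unlock anchor, so the mutex is
 * released exactly once no matter where the work bails out. */
int validate(const char *s) { return s != NULL && s[0] != '\0'; }

int handle(const char *input)
{
    int rc = -1;
    pthread_mutex_lock(&mu);
    if (input == NULL)
        goto unlock;              /* error path: still releases the lock */
    if (!validate(input))
        goto unlock;
    rc = 0;                       /* success path falls through */
unlock:
    pthread_mutex_unlock(&mu);
    return rc;
}
```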

# Conclusion
## Conclusion

Just use a `mutex`; everything else is either more error-prone, more limiting, or just uses a `mutex` in the background anyway. Keep things sane by locking as little code as possible. And always make sure to throw locks around accesses to common memory space.

63
content/posts/secure-coding-in-c-summations-file-io.md
Normal file
@@ -0,0 +1,63 @@
---
title: "File I/O: Summations of Secure Coding in C and C++"
date: 2023-06-29
draft: false
---

## Introduction

Continuing summarizing the themes in "Secure Coding in C and C++" by Robert C. Seacord, we will discuss file I/O and how to prevent unauthorized access. File I/O is especially dangerous when a program is running in a privileged context and accesses files that unprivileged users can access. This can allow an attacker to read or even overwrite privileged files.

The tl;dr here: use proper file permissions, verify file paths, and follow the principle of least privilege.

This post assumes basic knowledge of file system permissions and how paths are determined.

## Big Issues

There are several issues that can arise while attempting to access files on the system:

1. [User input providing paths they shouldn't have access to](#unauthorized-path-access)
1. [Setting bad file permissions](#bad-file-permissions)
1. [Accessing unprivileged files from a privileged process](#principle-of-least-privilege)

Without properly handling these three primary issues, a process could leak information or provide a path for an attacker to alter system files.

## Unauthorized Path Access

### Manipulated Paths

Similar to SQL injection, a user can manipulate a path to attempt to access locations they shouldn't otherwise be able to access. The classic example is using the `..` notation to go up a directory level. Using multiple `../../../../` will eventually reach the root of the system, allowing a malicious user to access the entire system.

There are other more subtle ways to perform these types of operations as well. While Unix-based systems are generally case sensitive, Windows and certain programs are not; thus `../../../PRIVATE` is the same as `../../../private`.

Some systems will also evaluate `.../......./` to `../../`, bypassing some checks.

There are enough ways to perform directory traversal that it becomes difficult to filter all of them. Thus the solution is similar to SQL injection: don't filter, sanitize. For SQL injection, that means escaping the input; for directory traversal, it means asking the underlying system for the absolute path.

By requesting the absolute path, all these tricks are flattened into a standard path. Then the program can verify it should be accessing that path.

## Bad File Permissions

On the surface this one is pretty simple. When creating a file, give it the most restrictive access possible for functionality to continue. By limiting access, a malicious actor will have a harder time viewing and manipulating the data. This should definitely be done, but there are some more subtle ways to keep things secure.

There are other file attributes that need to be considered. By checking and storing things like the inode number, link count, and device ID, there is more assurance that this is the correct file and that it hasn't been replaced.
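A sketch combining both ideas, restrictive creation plus identity capture, using POSIX calls; `create_private` is an illustrative helper:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a file readable/writable only by its owner (mode 0600) and
 * record its identity so later opens can detect replacement. O_EXCL
 * makes creation fail if the path already exists (e.g. a planted
 * symlink). */
int create_private(const char *path, struct stat *st)
{
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;
    if (fstat(fd, st) < 0) {          /* capture inode, device, links */
        close(fd);
        return -1;
    }
    return fd;                        /* caller closes; st has identity */
}
```

Before reusing the file later, compare `st_ino`, `st_dev`, and `st_nlink` from a fresh `fstat` against the stored values.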

## Principle of Least Privilege

Keep the program running as an unprivileged user and only request more privileges when needed. This is good advice for any program, but it is especially handy for file I/O.

In this case, when accessing a globally accessible file (such as in `/tmp`) the program should NOT be running privileged. This is because a malicious actor could replace the `/tmp` file with a symbolic link to a privileged file. The program will happily open that link and read or modify the contents.

If the program is running unprivileged when accessing these files, following such a link to a privileged file will produce a file system error. This prevents the program from accessing files out of scope.
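A sketch of dropping effective privileges before the risky open; this only has a visible effect in a setuid program, where `geteuid()` differs from `getuid()`, and `read_shared_file` is a made-up example:

```c
#include <stdio.h>
#include <unistd.h>

/* Drop effective privileges to the real user before touching a
 * world-writable location. In a normal (non-setuid) process this
 * seteuid call is a harmless no-op. */
int read_shared_file(const char *path)
{
    if (seteuid(getuid()) != 0)       /* run unprivileged from here on */
        return -1;
    FILE *f = fopen(path, "r");       /* a symlink to a root-only file
                                       * now fails instead of leaking it */
    if (f == NULL)
        return -1;
    fclose(f);
    return 0;
}
```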

## Conclusion

There are a few takeaways from exploring issues with file I/O.

First (and this is true for everything), never trust user input. Anything that could be coming from the user should be verified, even if there are controls on the user interface. In the context of file I/O, this means verifying file names and paths. Always expand the path, then verify that the path is valid. Un-expanded paths have numerous ways to hide the actual path the OS will use.

Next, make sure any files created have the most restrictive permissions possible. This way only the process owner can read, write, and delete the file.

Finally, run the process with the least privileges possible, only escalating when absolutely necessary. This way, if an attacker attempts to trick the process into accessing privileged files, it will not be able to.

Using these three techniques should help stave off the worst of the file I/O issues.
@@ -4,7 +4,7 @@ date: 2022-08-17
draft: false
---

# Introduction
## Introduction

Continuing the series of summarizing the themes in "Secure Coding in C and C++" by Robert C. Seacord, we will discuss freeing pointers. The title of this section is specifically about setting to `NULL` after calling free, but this post will cover a lot more than that. Here we will discuss the problems with forgetting to free, functions whose return value needs to be freed, and freeing without allocating/double free.

@@ -12,11 +12,11 @@ As for the title of this piece, some of the most common problems can be solved s

This is written for an audience that has a broad overview of security concepts. Not much time is spent explaining each concept, and I encourage everyone to read the book.

# Always `free` When Done
## Always `free` When Done

First off, let's discuss why `free` is important. Without freeing variables, the best case is that you end up with leaked memory, and the worst case is that you introduce vulnerabilities.

## Memory Leaks
### Memory Leaks

When non-pointer variables are declared, they are restricted to the scope in which they were created, and their memory is reclaimed at the end of that scope. For pointers, however, allocated memory is not restricted by scope. So if a pointer is not freed before the end of its scope, that memory will still be held by the process. Depending on how large these allocations are, you could fill memory quite quickly. At best this will lead to crashing your own program (if the OS restricts memory); at worst you will crash the system.

@@ -24,13 +24,13 @@ One of the best ways to handle this is with `goto`'s. Yes, despite the hate for

Also, by using the `goto` and anchor, it will prevent another possible vulnerability: use after free. This is also discussed in the next section.

## Vulnerabilities
### Vulnerabilities

The other problem with forgetting to call free is allowing an attacker to gain access to your memory space, which could cause sensitive data to be leaked. By exploiting other vulnerabilities an attacker could gain access to memory that was supposed to be freed. Another problem with forgetting to free is denial of service attacks: an attacker can specifically target the memory leak to overload the system.

Another vulnerability isn't forgetting to free, but forgetting that you did free. Use after free can be a big issue. If an attacker can fill the memory space that was previously freed, then when the program uses the pointer again, instead of erroring out the vulnerable program will use the new data. This could result in code execution, depending on how the memory is used.

# Knowing When to `free`
## Knowing When to `free`

When you as the developer call `calloc`, `malloc`, or almost anything else with an `alloc`, it's pretty clear that those need to be freed. You declared the pointer and created the memory. But there are other situations that are not as clear: calling functions that allocate memory for you.

@@ -38,11 +38,11 @@ These functions could either be built in functions like `strdup` or ones which y

This is a perfect situation for a `goto` to an anchor at the end of the method. Then there only needs to be a single `free`, preventing use after free and double free. It also then requires only a single return, which prevents returning before freeing and reduces the risk of memory leaks.
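A sketch of the anchor pattern; `process` is a made-up example function:

```c
#include <stdlib.h>
#include <string.h>

/* One allocation, one free: every exit path funnels through the cleanup
 * anchor, so the copy can be neither leaked nor freed twice. */
int process(const char *input)
{
    int rc = -1;
    char *copy = malloc(strlen(input) + 1);
    if (copy == NULL)
        goto cleanup;
    strcpy(copy, input);
    if (copy[0] == '\0')
        goto cleanup;            /* validation failed: still freed once */
    rc = 0;                      /* real work would happen here */
cleanup:
    free(copy);                  /* free(NULL) is a safe no-op */
    copy = NULL;                 /* guard against use after free */
    return rc;
}
```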

# Knowing When NOT to `free`
## Knowing When NOT to `free`

Knowing when not to free is not as big of an issue as not freeing, but it can still cause problems. Double frees, freeing before allocating, and freeing without allocating can cause your program to crash. That may not leak data directly, but it can make you vulnerable to denial of service.

# Conclusion
## Conclusion

Freeing is vitally important to keeping your programs safe. All allocations need to be freed, and it's best to free at the end of the method the pointer was allocated in. This will help prevent use after free and forgetting to free. An anchor at the end of a method with `goto` is the best way to accomplish this.

@@ -4,19 +4,19 @@ date: 2022-08-13
draft: false
---

# Introduction
## Introduction

Series on summarizing themes in "Secure Coding in C and C++" by Robert C. Seacord, part 2. Find part 1 here [Always null Terminate (Part 1)]({{<ref "secure-coding-in-c-summations-null-terminate.md">}}). We are currently going through this book in our work book club and there are a lot of good themes that seem to be threaded through the book. These are my notes, thoughts, and summaries on some of what I've read and our book club have discussed.
Series on summarizing themes in "Secure Coding in C and C++" by Robert C. Seacord, part 2. Find part 1 here [Always null Terminate (Part 1)](/posts/secure-coding-in-c-summations-null-terminate). We are currently going through this book in our work book club and there are a lot of good themes that seem to be threaded through the book. These are my notes, thoughts, and summaries on some of what I've read and our book club have discussed.

This is written for an audience that has a broad overview of security concepts. Not much time is spent explaining each concept, and I encourage everyone to read the book.

The first theme to discuss is always `null` terminating `char *` or `char array` buffers (unless you have a *very* specific reason for not). This is very important to help prevent buffer overflows, reading arbitrary memory, accessing 'inaccessible' memory. This is part 2 where we will discuss string cat and length. For a brief discussion on string copy see [part 1]({{<ref "secure-coding-in-c-summations-null-terminate.md">}}).
The first theme to discuss is always `null` terminating `char *` or `char array` buffers (unless you have a *very* specific reason for not). This is very important to help prevent buffer overflows, reading arbitrary memory, accessing 'inaccessible' memory. This is part 2 where we will discuss string cat and length. For a brief discussion on string copy see [part 1](/posts/secure-coding-in-c-summations-null-terminate.md).

# Functions Needing null
## Functions Needing null

One of the important reasons to `null` terminate is there are several very common functions that require `null` termination. Even some that you wouldn't necessarily think of. Without having `null` at the end of the buffer, it creates a situation where things could go wrong.

## String Cat
### String Cat

The next set of functions to look at are for concatenating strings. These not only need to be `null` terminated, but they also need to be properly allocated. If they are not, a concatenation could overwrite `null` terminators, and the resulting string could cause errors further along in the code. Memory allocation will be discussed further in another post. First I'm going to throw a table at you; it gives a summary of string concatenation functions and how they handle some of the issues. We will discuss further after the table.

@@ -29,9 +29,9 @@ The next set of functions to look at are concatenating strings. These not only n

Let's go over each function:

### strcat
#### strcat

```
```c
char *strcat(char *dest, char *src)
```

@@ -46,9 +46,9 @@ Arbitrary memory reads can be a problem since it could mean revealing data meant

Be sure to set the last character to `null` after the `strcat` is completed.

### strncat
#### strncat

```
```c
strncat(char *dest, char *src, size_t src_len)
```

@@ -59,9 +59,9 @@ In addition if `src` is not `null` terminated and `src_len` is longer than the l
`strncat` helps the developer watch for these issues but doesn't actually solve them.

### strlcat
#### strlcat

```
```c
size_t strlcat(char *dst, const char *src, size_t size)
```

@@ -74,13 +74,13 @@ Point one is great so you don't need to worry as much about pre setting the memo

Point two is good so you can compare `size` to the return value to see if the source was truncated.
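`strlcat` is a BSD function and not in glibc, so to show the truncation check here is a minimal local sketch of its contract (assuming `dst` is already a proper string shorter than `size`); `my_strlcat` is my own stand-in:

```c
#include <string.h>

/* Minimal sketch of the BSD strlcat contract: append src, always
 * terminate, and return the length the full result would have had.
 * A return value >= size means the output was truncated. */
size_t my_strlcat(char *dst, const char *src, size_t size)
{
    size_t dlen = strlen(dst);
    size_t slen = strlen(src);
    size_t room = size - dlen - 1;          /* space left for src bytes */
    size_t n = slen < room ? slen : room;
    memcpy(dst + dlen, src, n);
    dst[dlen + n] = '\0';                   /* always terminated */
    return dlen + slen;                     /* compare against size */
}
```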

## Sensing a Theme
### Sensing a Theme

There are two themes for string concatenation: one is **`null` terminate all character buffers**; the second is proper memory allocation, which will be discussed in a future post.

Every one of these functions requires the source and destination to be `null` terminated. If they are not, or if there is a `null` in the middle, it will cause issues!

# Conclusion
## Conclusion

`null` termination is important so that we don't accidentally read or write to arbitrary memory. This concludes the discussion on `null` termination; the next post will cover proper memory allocation.
|
||||
|
||||
|
||||
@@ -4,7 +4,7 @@ date: 2021-09-01
draft: false
---

-# Introduction
+## Introduction

Welcome to the next series, summarizing themes in "Secure Coding in C and C++" by Robert C. Seacord. We are currently going through this book in our work book club, and there are a lot of good themes threaded through it. These are my notes, thoughts, and summaries on some of what I've read and what our book club has discussed.

@@ -12,11 +12,11 @@ This is written for an audience that has a broad overview of security concepts.
The first theme to discuss is always `null` terminating `char *` or `char` array buffers (unless you have a *very* specific reason not to). This is very important to help prevent buffer overflows, reading arbitrary memory, and accessing 'inaccessible' memory.

-# Functions Needing null
+## Functions Needing null

One of the important reasons to `null` terminate is that several very common functions require `null` termination, even some that you wouldn't necessarily think of. Without a `null` at the end of the buffer, things can go wrong.

-## String Copy
+### String Copy

The first set of functions to look at are those that copy strings. These not only need to be `null` terminated, they also need to be properly allocated. Memory allocation will be discussed further in another post. First, here is a table summarizing the string copy functions and how each handles these issues; we will discuss them in more detail after the table.

@@ -29,9 +29,9 @@ The first set of functions to look at are copying strings. These not only need t
Let's go over each function:

-### strcpy
+#### strcpy

-```
+```c
strcpy(char *dest, char *src)
```

@@ -44,9 +44,9 @@ This function is super basic and needs a lot of careful programming. The destina
Arbitrary memory reads can be a problem since they could mean revealing data meant to be secret. Depending on where memory is allocated, sensitive data could be revealed to the user.
-### strncpy
+#### strncpy

-```
+```c
strncpy(char *dest, char *src, size_t dest_len)
```

@@ -58,9 +58,9 @@ The only thing it does is *helps* with buffer overflows. However, if the `dest_l
So `strncpy` can still read arbitrary memory and can still buffer overflow (though overflows are more difficult).
-### strlcpy
+#### strlcpy

-```
+```c
size_t strlcpy(char *dst, const char *src, size_t size)
```

@@ -73,9 +73,9 @@ Point one is great so you don't need to worry as much about pre setting the memo
Point two is useful because you can compare `size` to the return value to see whether the source was truncated.
-### strdup
+#### strdup

-```
+```c
char *strdup(const char *s);
```

@@ -85,12 +85,12 @@ The only thing to note is that it reads until the `null` terminator.
One important thing to note: the returned value must be `free`'d.
-## Sensing a Theme
+### Sensing a Theme

See the theme yet? ... **`null` terminate all character buffers**

Every one of these functions requires the source to be `null` terminated. If it is not, or if there is a `null` in the middle, it will cause issues!
-# Conclusion
+## Conclusion

`null` termination is very important to prevent reading or writing memory locations that should not be accessed. In this post we discussed copying strings. In the next post, we will continue this theme with concatenating strings.

@@ -4,7 +4,7 @@ date: 2019-08-23
draft: false
---

-# Introduction
+## Introduction

In order to allow flexibility in deployment location and to preserve user privacy, we performed research into stateless classification of network traffic. Because traffic does not always follow the same path through a network, by not tracking state we can deploy anywhere. We also use only one direction of traffic, since replies may follow a different path through the network. And by not requiring data within the packet, we can analyze encrypted traffic as well.

@@ -12,7 +12,7 @@ Our research shows that it is possible to determine if traffic is malicious by u
This post serves as an introduction to my master's thesis of the same title. [Full paper for those interested.](/security/StatelessDetectionOfMaliciousTraffic.pdf)

-# What Was Done
+## What Was Done

The system we developed for this research was an intrusion detection system (IDS), and thus does not block any traffic. Most IDSs use specific signatures for traffic. These are inflexible and will only detect the specific attack: if the traffic is modified in any way, it will no longer be detected. Instead of signatures, our system looks at ongoing traffic patterns.

@@ -22,7 +22,7 @@ Our system differs since it uses patterns. Because of this, we cannot say for ce
We used three primary data points to determine whether traffic was malicious: destination port, TTL, and packet frequency. To perform the classification, we used WEKA (an open-source machine learning toolkit) and focused on Bayesian network (BayesNet) classification.

-# Conclusions
+## Conclusions

While performing the research, we observed that using the port alone provided the least confidence. This isn't surprising, since it is only useful for detecting network scans. Packet frequency proved to be a better data point for classification: benign traffic tended to have a burst at the beginning, with fairly regular communication for the rest of a session, while malicious traffic would have a large burst of traffic followed by nothing, or very little traffic. TTL proved to be one of the best signals. Most benign traffic goes to a few locations that are usually physically close, whereas the TTL of malicious traffic is usually smaller, whether due to more distant physical locations, as part of the attack itself, or because the attacker is probing the victim network for information.

Submodule themes/hugo-theme-terminal updated: 007d7f3df6...641c5a27ac