Knowledge Repository

Skills: Programming, cryptography, GitHub, steganography, forensics


Last updated 1 year ago


Participation page: https://sunshinectf.org/

Event weight on ctftime: 34

START: 2023-10-07 14:00 UTC

FINISH: 2023-10-09 14:00 UTC

Flag Format: sun{stuffs}

Category: Miscellaneous

Points system: Dynamic reducing based on number of solves

Points collected: 396

The challenge file is a plain-text EML file that stores an e-mail conversation. The contents of the e-mail were base64 encoded.

Upon inspecting the file, I saw that there was an attachment too.

After decoding the message contents, I saw this message:

I used an online EML viewer to directly download the attachment. This was a Git bundle.
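As an offline alternative to the online viewer, the base64 body and the attachment can be pulled out with Python's standard `email` package. This is only a minimal sketch: the function name `extract_parts` is hypothetical, and it assumes the EML file is a regular MIME message.

```python
import email
from email import policy


def extract_parts(raw_bytes):
    """Return (body_text, attachments) parsed from raw EML bytes.

    attachments is a list of (filename, payload_bytes) tuples.
    """
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    body_text = []
    attachments = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no payload of their own
        filename = part.get_filename()
        if filename:
            # decode=True undoes the base64 transfer encoding
            attachments.append((filename, part.get_payload(decode=True)))
        elif part.get_content_type() == "text/plain":
            body_text.append(part.get_content())
    return "".join(body_text), attachments
```

To use it, read the EML file in binary mode, pass the bytes to `extract_parts`, and write each attachment payload to disk under its file name.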

I then created a new Git repo and verified the bundle.

Then I cloned the repository from the bundle.
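The verify and clone steps can be sketched as follows. The exact commands I ran are not reproduced here, so this is an assumption built on the standard `git init`, `git bundle verify` and `git clone` subcommands; the paths and the helper names `bundle_commands` and `verify_and_clone` are hypothetical.

```python
import os
import subprocess


def bundle_commands(bundle_path, scratch_repo, clone_dir):
    """Build the git commands that verify a bundle and then clone it."""
    bundle = os.path.abspath(bundle_path)  # absolute, since -C changes directory
    return [
        ["git", "init", scratch_repo],                          # verify needs a repo
        ["git", "-C", scratch_repo, "bundle", "verify", bundle],
        ["git", "clone", bundle, clone_dir],
    ]


def verify_and_clone(bundle_path, scratch_repo, clone_dir):
    """Run the commands in order, stopping at the first failure."""
    for cmd in bundle_commands(bundle_path, scratch_repo, clone_dir):
        subprocess.run(cmd, check=True)
```

Calling `verify_and_clone("repo.bundle", "scratch", "cloned")` would leave the full history in the `cloned` directory, assuming git is installed.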

Upon visiting the recently cloned repository, I saw an audio file.

Listening to this audio file, I heard a simple series of beeps of varying frequency. The beeps changed quickly and had only two distinguishable tones, which suggested Morse code.

Upon analyzing this on dcode.fr, I observed that it was NATO-encoded text.

Deciphering it further, I found that it was English for "=".

Since this is clearly not a flag, there has to be more to it. A trailing "=" suggests the end of a base64 or base32 encoded string.

I examined the Git repo a bit more and found that it contained 3000+ commits.

git rev-list --count --all

Inspecting git log and a few of the commits (by reverting the repo to each commit), I observed that the data file varied in size from commit to commit. So each data file must hold a different character or word, and these would later be combined to give the next clue.

Checking different commits, I inspected a few data files, and it turned out that each one of them held a single character (Alfa, Bravo, Delta, etc.).

All of these files use the NATO phonetic alphabet, where Alfa, Bravo, Charlie stand for A, B, C, and so on.
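The letter part of this mapping is small enough to write down directly. A minimal sketch follows; the names `NATO_WORDS` and `nato_to_char` are hypothetical, and how the challenge spelled digits and "=" is not covered here, so unknown words simply return None.

```python
# The 26 NATO phonetic code words, in alphabetical order.
NATO_WORDS = [
    "Alfa", "Bravo", "Charlie", "Delta", "Echo", "Foxtrot", "Golf",
    "Hotel", "India", "Juliett", "Kilo", "Lima", "Mike", "November",
    "Oscar", "Papa", "Quebec", "Romeo", "Sierra", "Tango", "Uniform",
    "Victor", "Whiskey", "Xray", "Yankee", "Zulu",
]

# Map each lowercase code word to its first letter, e.g. "alfa" -> "A".
NATO_TO_CHAR = {word.lower(): word[0].upper() for word in NATO_WORDS}
# Accept a few common alternative spellings as well.
NATO_TO_CHAR.update({"alpha": "A", "juliet": "J", "x-ray": "X"})


def nato_to_char(word):
    """Return the letter for a NATO code word, or None if it is unknown."""
    return NATO_TO_CHAR.get(word.strip().lower())
```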

To extract the message, all of these characters need to be combined. Doing this by hand for 3016 commits is cumbersome, so a programmatic approach was essential to save time.

I found a script online that extracts all the files from every commit in a repository. Since it named its output only after the commits, I modified it to number the export directories so that the commits can be viewed in order.

Script used was:

#!/bin/bash -e

function getRepoSHA1List {
        pushd "$1" >/dev/null
        git rev-list --all
        popd >/dev/null
}

function exportCommits {
        local commitSHA1
        local count=1

        local IFS=$'\n'
        for commitSHA1 in $(getRepoSHA1List "$1"); do
                # build export directory for commit and create
                local exportDir="${2%%/}/$(printf "%04d" $count)"
                echo "Export $commitSHA1 -> $exportDir"
                mkdir --parents "$exportDir"

                # create archive from commit then unpack to export directory
                git \
                        --git-dir "$1/.git" \
                        archive \
                        --format tar \
                        "$commitSHA1" | \
                                tar \
                                        --directory "$exportDir" \
                                        --extract
                count=$((count + 1))
        done
}


# verify arguments
if [[ (! -d $1) || (! -d $2) ]]; then
        echo "Usage: $(basename "$0") GIT_DIR OUTPUT_DIR"
        exit 1
fi

if [[ ! -d "$1/.git" ]]; then
        echo "Error: it seems [$1] is not a Git repository?" >&2
        exit 1
fi

exportCommits "$1" "$2"

Running the script exported the data file of each commit into its own numbered folder.

Inspecting the data files in these folders, I saw that some of them had the same size. For example, files of size 4444 corresponded to the character "6".

Now an algorithm is needed to rename these files based on a size-to-character mapping. Since there are 26 English letters and 10 numerals, I need to classify up to 36 files of different sizes and put them in a dictionary for conversion. I wrote a simple Python script to rename the files of size 4444 to 6 first and tested it out. Before that, I ran simple CLI commands to check how many files of size 4444 there are.

Piping the matches through wc gives the number of files of size 4444.
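The same count can be obtained in Python. This is a sketch rather than the command I actually ran: `count_sizes` is a hypothetical helper, and it assumes every export folder holds one file named "data".

```python
import os
from collections import Counter


def count_sizes(directory, name="data"):
    """Count how many files called `name` exist for each file size."""
    sizes = Counter()
    for folder, _subfolders, filenames in os.walk(directory):
        if name in filenames:
            sizes[os.path.getsize(os.path.join(folder, name))] += 1
    return sizes
```

Looking up `count_sizes(output_directory)[4444]` then gives the number of data files of size 4444, and the whole counter shows how many distinct sizes need to be classified.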

I wrote this small script to rename files of size 4444 to the digit 6. The dictionary "size_to_code" can be extended for other sizes as well, but for testing I only renamed the files of size 4444 first.

import os

size_to_code = {
    4444: "6",
}

def rename_files_in_directory(directory):
    count = 0
    for foldername, subfolders, filenames in os.walk(directory):
        for filename in filenames:
            filepath = os.path.join(foldername, filename)
            filesize = os.path.getsize(filepath)

            if filesize in size_to_code:
                new_filename = size_to_code[filesize] + os.path.splitext(filename)[1]
                new_filepath = os.path.join(foldername, new_filename)
                os.rename(filepath, new_filepath)
                count += 1
                print(f'{count}. Renamed {filename} to {new_filename}')

output_directory = '/home/kali/ctf/sunshine/knowledge_repo/_the_ai_repository_of_knowledge/output'
rename_files_in_directory(output_directory)

I saw that a total of 94 files were renamed, which matched the count I got earlier in the CLI. I then grepped and saw that all the files were renamed properly.

So the algorithm to solve and extract the message is as follows:

  1. Manually classify the 36 files (decode the audio Morse) and, based on their sizes, add them to the dictionary "size_to_code" in the Python script. To obtain the directory names of these 36 different files, write a simple bash script.

  2. Rename the files from NATO phonetic to their respective plaintext characters.

  3. Open the folders one by one using Python and read the file name. Keep appending the file names to a text file.

  4. Reverse the obtained string.

  5. Decode the base64 (or base32) string to plaintext and save it in a file.

  6. Extract the message.

I wrote a small script to find out the folders with data files of similar sizes:

import os
from collections import defaultdict

def find_files_with_same_size(directory):
    size_to_directories = defaultdict(list)
    for foldername, subfolders, filenames in os.walk(directory):
        for filename in filenames:
            if filename != "data":
                continue
            filepath = os.path.join(foldername, filename)
            filesize = os.path.getsize(filepath)
            folder = os.path.basename(foldername)
            size_to_directories[filesize].append(folder)

    with open('sizes_and_files.txt', 'w') as f:
        for size, folders in size_to_directories.items():
            if len(folders) > 1:
                f.write(f"Size: {size}\nFiles: {', '.join(folders)}\n\n")

output_directory = '/home/kali/ctf/sunshine/knowledge_repo/_the_ai_repository_of_knowledge/output'
find_files_with_same_size(output_directory)

The script groups the folders by the size of their "data" file and lists which folders hold files of the same size.

This shortened the time I spent looking through the audio files. Now I can manually decode one file per size, then add the decoded value to my solution script's dictionary.

Also, since the audio files showed no case sensitivity, it is safe to assume the message could be base32, where all letters are capital. The final dictionary becomes:

size_to_code = {
    4444: "6",
    11549: "E",
    35098: "F",
    14364: "2",
    18804: "A",
    22464: "N",
    10004: "3",
    32442: "Q",
    24124: "D",
    16614: "V",
    17094: "L",
    12439: "G",
    66378: "Y",
    76892: "I",
    50701: "Z",
    4986: "7",
    394044: "K",
    19998: "X",
    26684: "P",
    98164: "H",
    31644: "W",
    29616: "C",
    73508: "O",
    120294: "J",
    10404: "4",
    6036: "5",
    14576: "T",
    12830: "S",
    13006: "M",
    17643: "R",
    34298: "U",
    278894: "=",
}

Running the modified solution script renamed all 3016 files to their respective mappings.

Finally, I wrote a script that reads the renamed file name in each folder and writes these names, joined together, to a file called output.txt.

import os

def read_filenames_and_append_to_file(directory, output_file):
    characters = []
    for folder in sorted(os.listdir(directory)):
        folder_path = os.path.join(directory, folder)
        if not os.path.isdir(folder_path):
            continue
        files = os.listdir(folder_path)
        if files:
            characters.append(files[0])

    with open(output_file, 'w') as f:
        f.write(''.join(characters))

output_directory = '/home/kali/ctf/sunshine/knowledge_repo/_the_ai_repository_of_knowledge/output'
output_txt = '/home/kali/ctf/sunshine/knowledge_repo/_the_ai_repository_of_knowledge/output.txt'

read_filenames_and_append_to_file(output_directory, output_txt)

Running the script produced an output.txt file. Since it starts with "===", the base32 string appears to be reversed.

So, to get a proper base32 string, I reversed it, decoded it, and saved the output in a file called final.

rev output.txt | base32 -d > final
file final

This turned out to be a gzip file. Finally, I used gunzip to extract a text file from this archive and obtained the flag!
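The reverse, base32 decode and gunzip steps can also be done in one go in Python. A minimal sketch, assuming the content of output.txt is a reversed, correctly padded base32 string that wraps a gzip stream; `decode_message` is a hypothetical name.

```python
import base64
import gzip


def decode_message(reversed_b32):
    """Reverse the string, base32-decode it, then gunzip the result."""
    b32 = reversed_b32.strip()[::-1]      # undo the reversal ("===..." -> "...===")
    compressed = base64.b32decode(b32)    # base32 text -> gzip bytes
    return gzip.decompress(compressed)    # gzip bytes -> original content
```

Reading output.txt as text and passing it to `decode_message` would return the bytes of the extracted text file.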

Alternate: One can also use Python-based decoders like morse2ascii and wav2ascii to do this. I tried them, but many of the short audio files were not decoded correctly. Also, the same algorithm can be applied using MD5 checksums of the files instead of their sizes.
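A sketch of the MD5 variant follows. It was not part of my actual solution, so treat it as an assumption of how the grouping could look; `group_by_md5` is a hypothetical helper and it expects the same numbered export folders, each holding one "data" file.

```python
import hashlib
import os
from collections import defaultdict


def group_by_md5(directory, name="data"):
    """Group export folders by the MD5 digest of their `name` file."""
    groups = defaultdict(list)
    for folder, _subfolders, filenames in os.walk(directory):
        if name not in filenames:
            continue
        with open(os.path.join(folder, name), "rb") as f:
            digest = hashlib.md5(f.read()).hexdigest()
        groups[digest].append(os.path.basename(folder))
    # Sort folder names so each group lists commits in export order.
    return {digest: sorted(folders) for digest, folders in groups.items()}
```

Grouping by checksum is stricter than grouping by size: two different audio files that happen to have the same length would still land in separate groups.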

The challenge provides a zip file. I unzipped it and found an EML file inside. Since I had never encountered an EML file in a CTF before, I had to read up on the format: https://www.adobe.com/uk/acrobat/resources/document-files/text-files/eml.html

Deciphering the audio with an online tool (https://morsecode.world/international/decoder/audio-decoder-adaptive.html) produced some text that initially didn't make sense.

The script I found online that extracts all the files in a repository is here: https://gist.github.com/magnetikonline/5faab765cf0775ea70cd2aa38bd70432
https://sunshinectf.org/
data.wav (272KB)