Beginners Guide to OSINT – Chapter 1

DISCLAIMER: I’m not an Open Source Intelligence (OSINT) professional (not even close). I’ve done some courses (got a cert), written some code and spent far too much time using Maltego. OSINT is a subject I enjoy; it’s like doing a jigsaw puzzle with most of the pieces missing. This blog series is MY interpretation of how I do (and view) OSINT related things. You are more than welcome to disagree or ignore what I say.

The first chapter in the OSINT journey is going to cover the subject of “What is OSINT and what can we use it for?”. Sorry, it’s the non-technical one, but I promise not to make it too long or boring.

What is OSINT??

OSINT is defined by Wikipedia as:

“Open-source intelligence (OSINT) is intelligence collected from publicly available sources. In the intelligence community (IC), the term “open” refers to overt, publicly available sources (as opposed to covert or clandestine sources); it is not related to open-source software or public intelligence.” (source)

For the purpose of this blog series we are going to be talking about OSINT from online sources, so basically anything you can find on the internet. The key point for me is that OSINT (in my opinion) only relates to information you can find for free. Having to pay for access to information, such as an API or raw data, isn’t really OSINT material. You are essentially paying someone else to collect the data (that’s the OSINT part) and then just accessing their data. I’m not saying that’s wrong, or a reason not to use data from paid sources; it’s just (and again, just my opinion) not really OSINT in its truest form.

Pitfalls of OSINT

Before we go any further I just want to clarify something about collecting data via OSINT. This is something I often talk to people about, and there are varying opinions on it. When you collect data via OSINT methods, it’s important to remember that the data is only as good as the source you collect it from. The simple rule is: “Don’t trust the source? Don’t use it.”

You also need to consider the way the data is collected. Let me explain a bit more; consider this scenario (totally made up).

You spot someone (within a corporate environment) called Ronny emailing a file called “secretinformation.docx” to an external email address of ronnythespy@madeupemail.com. You decide to do some “OSINT” to work out if the two Ronnies are the same person. Using a tool or chunk of code (in a language you don’t know) you decide that you have enough information to link the two Ronnies together.

Corporate Ronny takes you to court to claim unfair dismissal, and during the court proceedings you are asked (as the expert witness) how the information was collected. Now, you can explain the process you followed (run the code or click on the tool), but can you explain how the tool or chunk of code provided you with that information, or the methods it used to collect it (were they lawful, for example)?

For me, that’s the biggest consideration when using OSINT material if you want it to provide true value to what you are trying to accomplish. Being able to collect the information is one thing; validating the methods or techniques used to collect it is another. Again, this is a conversation I have had many, many times, and I work on this simple principle: “if in doubt, create it yourself”, which basically means I have to/get to write some code or build a tool.

This quote essentially sums up everything I just said: “In OSINT, the chief difficulty is in identifying relevant, reliable sources from the vast amount of publicly available information.” (source)

What is OSINT good for?

Absolutely everything!! Well, ok, nearly everything, but there are a lot of ways that OSINT can be used for fun or within your current job. Here are some examples:

  • Company Due Diligence
  • Recruitment
  • Threat Intelligence
  • Fraud & Theft
  • Marketing
  • Missing Persons

What are we going to cover??

At the moment I’ve got a few topics in mind to cover in this blog series. I am open to suggestions or ideas, so if you have anything let me know and I will see what I can do. Here are the topics I’ve come up with so far (subject to change).

  • Image Searching
  • Social Media
  • Internet Infrastructure
  • Companies
  • Websites

Hopefully you found this blog post of use (or interesting); leave a comment if you want me to cover another subject or have any questions/queries/concerns/complaints.

Building your own Whois API Server

So, it’s been a while since I’ve blogged anything; not because I haven’t been busy (I’ve actually been really busy), but more because a lot of the things I work on now I can’t share (sorry). However, every now and again I end up coding something useful (well, I think it is) that I can share.

I’ve been looking at Domain Squatting recently and needed a way to codify whois lookups for domains. There are loads of APIs out there but you have to pay and I didn’t want to, so I wrote my own.

It’s a lightweight Flask application that accepts a domain, does a whois lookup and then returns a nice JSON response. Nothing fancy, but it will run quite happily on a low-spec AWS instance or on a server in your internal environment. I built it to get around having to fudge whois on a Windows server (let’s not go there). There’s a rough sketch of what the code looks like just below.

In order to run the Flask application you need the following Python libraries (everything else is standard Python libraries).

  • Flask
  • pythonwhois (this is needed for the pwhois command that is used in the code)
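For illustration, here’s a minimal sketch of the sort of thing whois-server.py does. This is a reconstruction rather than the actual code from the repo: the route and port match the ones described in this post, pythonwhois.get_whois() does the lookup, and json.dumps() with default=str takes care of the datetime objects that whois records contain.

#!/usr/bin/env python

# Minimal whois API server sketch: accepts ?domain=<name> and returns
# the whois record as JSON.
import json
from flask import Flask, request, Response
import pythonwhois

app = Flask(__name__)

@app.route('/query/', methods=['GET'])
def query():
    domain = request.args.get('domain')
    if not domain:
        return Response(json.dumps({'error': 'no domain supplied'}),
                        mimetype='application/json', status=400)
    try:
        record = pythonwhois.get_whois(domain)
        # default=str converts the datetime objects in the record to strings
        return Response(json.dumps(record, default=str),
                        mimetype='application/json')
    except Exception as e:
        return Response(json.dumps({'error': str(e)}),
                        mimetype='application/json', status=500)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=9119)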

To run the server just download the code (link at the bottom of the page) and then run:

python whois-server.py

The server runs on port 9119 (you can change this) and you can submit a query like this:

http://localhost:9119/query/?domain=example.com

You will get a nice JSON response back with all the whois data.
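If you want to call it from code rather than a browser, the requests library makes it a one-liner (a hypothetical quick test, assuming the server is running locally):

import requests

r = requests.get('http://localhost:9119/query/', params={'domain': 'example.com'})
print r.json()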

From here you can either roll it into your own tool set or just use it for fun (not sure what sort of fun you are into but..).

You can find the code in my GitHub repo MyJunk or if you just want this code it’s here.

There may well be some bugs, but I haven’t found any yet. It runs best on Linux (or Mac OSX).

Any questions, queries etc. etc. you know where to find me.

Pandora – Maltego Graph Thingy

I talk to a lot of different people about Maltego, whether it’s financial institutions, law enforcement, security professionals or plain old stalkers (only kidding), and the question I usually end up asking them is this:

What do you want to do with the data once it’s in Maltego?

The reporting features in Maltego are good, but sometimes you want something a little bit “different”, usually because you want to add the data you collect in Maltego to another tool you may have, or you want to share the information with others (who don’t have Maltego), or just because you want to spin your own reports.

A few weeks ago on my OSINT course we talked about classifying open source intelligence against the National Intelligence Model (5x5x5), so I decided to see if I could write a tool that would take a Maltego graph and do just that. In addition (more as a by-product) you can now export Maltego graphs (including link information) to a JSON file.

I would like to thank Nadeem Douba (@ndouba) for the inspiration (and some of the code) which is part of the Canari Framework and originally allowed you to export a Maltego Graph to CSV format.

Pandora is a simple lightweight tool that has two main uses. The first is a simple command-line interface (pandora.py) that allows you to specify a Maltego graph and just spits out a JSON file.

The second usage has the same functionality but via a simple web interface (webserver.py). You can export your Maltego graph to JSON and then get a table based view of the entities held within; you can also click on a link that shows all the outgoing and incoming linked entity types.

This is still a BETA at the moment; the JSON stuff works, but the web interface has a few quirks to it. Over the next few weeks I will be adding extra stuff like reporting, the ability to send the JSON to ElasticSearch or Splunk (which has a new HTTP listener available), and some other cool stuff.

You can find Pandora HERE and some screenshots are below:

Maltego Graph (example)

Pandora – Web Interface

Pandora – Web interface with imported graph

Pandora – Graph Information

Pandora – Link Information

As always any questions, issues etc etc please let me know.

Open Source Cyber Intelligence – Course Review

DISCLAIMER: This review is based on my own experience attending the course, and is no way affiliated with the training provider or my current employer. All opinions stated in the review are my personal views and should be treated as such.

A couple of weeks ago (might be more) I spent the week in London on a training course (well, actually it was two, but..). The courses were run by QA and are part of their Cyber Security range. Details of the courses are below (just in case you want to go on them).

Open Source Cyber Intelligence – Introduction (3 days)

Open Source Cyber Intelligence – Advanced (2 days)

Now it’s important to note at this point that the courses are focused on “Cyber” based Open Source Intelligence techniques rather than the more generic Open Source Intelligence, which in my mind is more about stalking people, sorry, I mean tracking people.

Below is a brief outline of what the courses contained (taken from the QA website).

Open Source Cyber Intelligence – Introduction

Introduction
Module 1 – History of the Internet and the World Wide Web
Module 2 – How devices communicate
Module 3 – Internet Infrastructure
Module 4 – Search Engines
Module 5 – Companies and people
Module 6 – Analysing the code
Module 7 – The Deep Web
Module 8 – Social Media
Module 9 – Protecting your digital footprint
Module 10 – Internet Communities and Culture
Module 11 – Cyber Threat
Module 12 – Tools for investigators
Module 13 – Legislation

Open Source Cyber Intelligence – Advanced

Module 1 – Advanced search and Google hacking
Module 2 – Mobile devices; threats and opportunities
Module 3 – Protecting your online footprint and spoofing
Module 4 – Advanced software
Module 5 – Hacking forums and dumping websites
Module 6 – Encryption and anonymity tools
Module 7 – Tor, Dark Web and Tor Hidden Services (THS)
Module 8 – Bitcoin and Virtual Currencies
Module 9 – Other Dark Webs and Darknets
Module 10 – Advanced evidential capture

NOTE: The courses are designed for people of any skill level, which is why some of the module titles may seem a little random and have you thinking “Why are they teaching networking basics?”.

It’s also important to point out that the prerequisite for the advanced course is that you have completed the introduction course first.

Course Review:

The size of the class (for both courses) was smaller than I expected, but that wasn’t a bad thing, as it gave us the chance to ask questions and provide a steer on the direction of the conversations without feeling like we were stopping lots of people from learning (and getting their money’s worth).

You get a preconfigured workstation that has all the tools you need for the course, as well as a Kali virtual machine, multiple browsers, plugins etc., and the workstation has plenty of grunt (CPU & memory) so it doesn’t get bogged down when running lots of things.

The instructor for both courses was a guy called Max Vetter, who has loads of experience in this area and made sure we understood the content of the course and that it was also fun (check out “If Google was a Guy” on YouTube).

Now I’m no OSINT expert, but I have worked in IT for nearly 20 years and I am a bit of an OSINT wannabe, so for me the introduction course was a bit slow. Don’t get me wrong, the content and the way it was delivered were awesome, but if you know about networking and how to do whois lookups or view the source code of websites, you may find the introduction course not to your liking.

If however you know how to do all of the above but have never done it within an OSINT type scenario, then the course will be really useful (and fun), as it will enable you to understand how to use the information you collect in order to track and trace “cyber bad guys”.

For example, if you find a “bad” domain you can query whois to find out who registered it, then use a bit of Google-fu (Google hacking is covered in the introduction course) to see if the details from the whois information lead you to any other domains the suspect might have registered.

Let me show you. Looking at the whois information for qa.com (the training provider) you will see the following:

For more information on Whois status codes, please visit
https://www.icann.org/resources/pages/epp-status-codes-2014-06-16-en.
Domain Name: qa.com
Registry Domain ID: 113160_DOMAIN_COM-VRSN
Registrar WHOIS Server: whois.1and1.com
Registrar URL: http://1and1.com
Updated Date: 2013-05-24T22:04:57Z
Creation Date: 1994-10-25T04:00:00Z
Registrar Registration Expiration Date: 2015-10-24T04:00:00Z
Registrar: 1&1 Internet AG
Registrar IANA ID: 83
Registrar Abuse Contact Email: abuse@1and1.com
Registrar Abuse Contact Phone: +1.8774612631
Reseller:
Domain Status: clientTransferProhibited https://www.icann.org/epp#clientTransferProhibited
Registry Registrant ID:
Registrant Name: Alexandra Kubicka
Registrant Organization: QA-IQ Ltd
Registrant Street: 80 Cannon Street
Registrant Street: 4th Floor
Registrant City: London
Registrant State/Province: ABE
Registrant Postal Code: EC4N 6HL
Registrant Country: GB
Registrant Phone: +44.8450559501
Registrant Phone Ext:
Registrant Fax: +44.8450559502
Registrant Fax Ext:
Registrant Email: IT@QA.com
Registry Admin ID:
Admin Name: Alexandra Kubicka
Admin Organization: QA-IQ Ltd
Admin Street: 80 Cannon Street
Admin Street: 4th Floor
Admin City: London
Admin State/Province: ABE
Admin Postal Code: EC4N 6HL
Admin Country: GB
Admin Phone: +44.8450559501
Admin Phone Ext:
Admin Fax: +44.8450559502
Admin Fax Ext:
Admin Email: IT@QA.com
Registry Tech ID:
Tech Name: Hostmaster ONEANDONE
Tech Organization: 1&1 Internet Ltd.
Tech Street: 10-14 Bath Road
Tech Street: Aquasulis House
Tech City: Slough
Tech State/Province: BRK
Tech Postal Code: SL1 3SA
Tech Country: GB
Tech Phone: +44.8716412121
Tech Phone Ext:
Tech Fax: +49.72191374215
Tech Fax Ext:
Tech Email: hostmaster@1and1.co.uk
Nameserver: ns1.dnsexit.com
Nameserver: ns2.dnsexit.com
Nameserver: ns3.dnsexit.com
Nameserver: ns4.dnsexit.com
DNSSEC: Unsigned

Then using one of the values from the whois (telephone number, email address, contact name(s)) you can get Google to find other domains that have the same whois information.

Within the Google search bar enter the following query:

site:whois.domaintools.com +44.8450559501

You will get back all the registered domains (stored in whois.domaintools.com) that have that phone number in their whois information. If like me you are a Python addict (it’s ok to admit it), you can automate this kind of process and write the information to something (flat file, database etc.), but that’s not covered in the course.
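If you fancy a starting point, here’s a rough sketch (mine, not from the course) that does something similar without the Google step: it loops over a list of candidate domains using the pythonwhois library mentioned earlier and writes any domain whose whois contacts contain a given phone number out to a CSV file. The candidate list is a hypothetical input.

# Sketch: flag candidate domains whose whois contacts share a phone number.
import csv
import pythonwhois

PHONE = '+44.8450559501'  # value taken from the whois record above
candidates = ['qa.com', 'example.com']  # hypothetical input list

with open('matches.csv', 'w') as f:
    writer = csv.writer(f)
    writer.writerow(['domain', 'contact', 'phone'])
    for domain in candidates:
        try:
            record = pythonwhois.get_whois(domain)
        except Exception:
            continue  # skip domains where the whois lookup fails
        for contact in (record.get('contacts') or {}).values():
            if contact and contact.get('phone') == PHONE:
                writer.writerow([domain, contact.get('name'), PHONE])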

The last two days of training were the fun stuff; the advanced course covered all the good stuff like Tor, “Darknet” forums, Bitcoin and other grey online communities. Again I had played around with most of this before (and read books etc.), and while the advanced course is also aimed at all skill levels, there was a good mixture of theory and practice.

Over the two days you get to mess around (within legal guidelines) with different “Darknets” such as Tor, I2P, Freenet and GNUnet, which is quite fun, and it’s interesting to see how mature some of the others are in comparison to Tor. You also learn about the history of Bitcoin (which in itself is quite interesting) and look at the different ways to track Bitcoin transactions, plus all the various “alt coins” that have sprung up (did you know there was a Bieber coin (BRC)??).

Summary:

I really enjoyed both courses. Yes, I knew a lot of the stuff already (not being big-headed), but it’s always good to have an expert validate what you know, especially when you have taught yourself most of it (and now I have certificates). The courses also introduced me to lots of new OSINT resources (websites, tools etc.), helped focus the “flow” of the data you can collect, and showed me some better ways of processing it and reusing it for other purposes. Plus they opened up lots of opportunities for new Maltego transforms (I bet you thought this was a Maltego-free blog post).

Pros:

  • Excellent instructor
  • Good facilities (lots of free coffee and biscuits)
  • Good course content which was delivered well
  • Fun (that’s important)

Cons:

  • Might not be the course for you if you already do OSINT related work or have a deep technical background around “Cyber”

If you have any questions or queries just give me a shout.

Adam

Maltego Magic comes to BSides London

I’m a big fan of BSides London; it was the first security conference I ever went to, and this will be my fourth year attending. The last couple of years I’ve been a “crew” member for the event, working in the background to help make the event what we all know and love. Last year I stepped in last-minute to run a Scapy workshop; this year I’ve decided to submit one on my other favourite thing: Maltego.

Below you will find a brief description of the workshop and the things you will need to bring with you if you are planning on attending.

Maltego Magic – Creating transforms & other stuff

In this workshop I will teach people how to write their own Maltego transforms. Using simple to understand examples (and pictures, everyone likes pictures) I will lead the participants through the process of creating local and remote transforms using just a pen and paper (ok a laptop is needed as well).

A basic knowledge of Maltego & Python is needed but the workshop will be aimed so that anyone can benefit from the magic that is Maltego even if they haven’t coded anything before.

Requirements for the day

  • Laptop (Mac OSX, Windows or Linux)
  • Python installation (version 2.7.x, not version 3)
  • The Python Requests library (sudo pip install requests)
  • Maltego (CE edition is ok)
  • A text editor that’s Python friendly or a Python IDE (Sublime Text, PyCharm etc)
  • Your imagination (borrow someone else’s if necessary)

The workshop information for the day is below:

Date: June 3rd 2015
Workshop: 3
Track No: 2
Duration: 2 hours
Schedule for: 14:00 – 16:00

If you are interested in writing Maltego transforms come along to the workshop; if you can’t make it I will be wandering around the con all day, so feel free to stop me and we can have a chat about Maltego Magic.

sniffMyPackets V2: Database or not??

When I started the work on sniffMyPackets version 2 I decided to make it default to using a database backend. The decision was based on trying to get the most out of the pcap files without crowding the Maltego graph. I knew at the time that this meant that people who want to use my code would have to have additional infrastructure for it to run. I’ve tried to minimize the pain by introducing a Vagrant machine that will build a MongoDB database instance and install the new web interface (which is awesome).

Recently I was asked if you “HAD” to use a database, and I realised that this decision might limit the number of people who will use sniffMyPackets. With that in mind I have now rewritten the Maltego transforms to work with or without a database. The limitation is that you don’t get to use all the transforms (such as replaying a session), but you still get the same output on the Maltego graph.

The changes have been pushed to the GitHub repo which is linked below:

https://github.com/SneakersInc/sniffmypacketsv2

By default the database support is switched off; the good news is that it only takes one change to enable it. When you create the Canari profile, a sniffmypacketsv2.conf file is created in the src/ directory. So for example:

canari create-profile sniffmypacketsv2 -w /root/localTransforms/sniffmypacketsv2/src

This creates the sniffmypacketsv2.conf file in the /root/localTransforms/sniffmypacketsv2/src directory. The configuration file comes preconfigured with several options, most of which you can change to meet your needs. The important one for database support is under the [working] section and is called ‘usedb’ (see the snippet below).

To enable database support, simply change the value from 0 to 1 and you are good to go (so to speak). You can turn it on and off whenever you want.
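For reference, the relevant bit of the config file looks something like this (the section and key names are as above; the rest of the file is omitted):

[working]
usedb = 1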

Any Maltego transform that won’t work without the database just returns a message to your Output window about needing database support. To be honest, most of the important ones work without database support, so you shouldn’t notice much difference.

If you decide not to use the database then you will also be missing out on the cool looking web interface.

Over the next few weeks more Maltego transforms will be created/added, and I’ve got some cool features planned in terms of Maltego TDS based transforms and the web interface. I’m also going to start working on adding NetFlow support to sniffMyPackets, so you can throw even more stuff into the mix.

I will also be working on the documentation and will run a blog series on some of the cool features you might not notice straight away.

Oh, my online demo site is still up (it sometimes stops working, but that’s not my code, more the server) and you can find it at:

http://demo.sniffmypackets.net

As always feedback is welcome (good or bad).

Enjoy.

sniffMyPackets Version 2 – The Release

If you follow me on Twitter you have probably noticed me bombarding you with tweets about the next release of sniffMyPackets; well, today it’s officially released in Beta form. There is still a lot of work to be done, and over the next few weeks/months expect a lot of changes. The purpose of this blog post is to give you an overview of what this new version is all about; my hope is that you download it, try it and let me know what you think.

DISCLAIMER:

This is a beta. I’ve tested the code as much as possible, but you might still have issues; if you do, please raise an Issue ticket on the respective GitHub repo.

What’s New:

The original sniffMyPackets was the first piece of code I wrote, and it was really more a journey of discovery than an attempt at producing anything that could be classed as “production ready”. This release of sniffMyPackets is my attempt at turning it into something that is not only useful (well, hopefully) but can also be used and shared among analysts within an organisation.

There are two main changes in this release (aside from better Python code): the first is the use of a database backend (currently MongoDB), the second is the introduction of a web interface. These two components work in harmony with the Maltego transforms to give you more information without overwhelming the Maltego graphs, while still giving you the visibility you need from within Maltego.

Database:

The database is used by the Maltego transforms to store additional information when transforming a pcap file. When building Maltego graphs you need to be able to tell a story, but at the same time not overwhelm the graph with too much information (especially when building large graphs). Each transform, when run, not only returns a Maltego entity but also writes additional information back into the database. You can then use this information to “replay” a pcap file without needing the actual pcap file (just the Session ID).

For example, when you first add a pcap file into a blank graph you need to index it. This process generates a unique Session ID for that pcap file (based on its MD5 hash), which is what is returned to your Maltego graph. Additional information is also written to the database, which is then displayed in the web interface. Below is the information collected from just that single transform, with a sketch of how such a record might be built after the list:

  • MD5 Hash
  • SHA1 Hash
  • Comments
  • Upload Time
  • File Name
  • Working Directory
  • Packet Count
  • First Packet
  • Last Packet
  • PCAP ID (or Session ID)
  • File Size
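To make that concrete, here’s an illustrative sketch (my own; the real schema may differ slightly) of how that indexing metadata could be put together in Python, using the field names from the list above:

import hashlib
import os
import time

def index_pcap(path):
    # Build the kind of indexing record described above; the Session ID
    # is based on the MD5 hash of the pcap file's contents.
    data = open(path, 'rb').read()
    md5 = hashlib.md5(data).hexdigest()
    return {
        'PCAP ID': md5,  # the Session ID returned to the Maltego graph
        'MD5 Hash': md5,
        'SHA1 Hash': hashlib.sha1(data).hexdigest(),
        'File Name': os.path.basename(path),
        'File Size': os.path.getsize(path),
        'Upload Time': time.strftime('%Y-%m-%d %H:%M:%S'),
    }

A record like this can then be inserted straight into MongoDB via pymongo.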

While you may think this slows the transform down, I’ve made every attempt for that not to be the case. A lot of the Python code for this release has been rewritten to speed up execution and remove redundant code.

Web Interface:

Personally this part has been the greatest challenge for me (but at the same time it’s awesome). The whole purpose behind this addition is to allow you to share the analysis of a pcap file with anyone. As you run transforms from Maltego on a pcap file, the web UI is populated with the “detailed” information. Even if you close the Maltego graph and delete the pcap file, that information is still there and you can still “replay” it back into Maltego (cool or what..).

The web UI has some cool features:

  • File upload functionality, so you can upload pcap files or artifacts from Maltego to the web server for other people to download.
  • Full packet view: when a TCP/UDP stream is indexed in Maltego you can view each individual packet from the web interface (based on Gobbler).

Other Stuff:

A lot of the Python code has changed. For example, the old method of extracting TCP & UDP streams using tshark has been replaced with pure Python code; this means it’s quicker and removes the need to have tshark/wireshark installed (and stops issues arising when the command syntax changes). The way that sniffMyPackets checks file types has been turned into a Python function; likewise, the check to make sure a pcap file isn’t in pcap-ng format is now a pure Python function.
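For the curious, a pcap-ng check can be done on the file’s magic number alone: classic pcap files start with the bytes d4 c3 b2 a1 (or a1 b2 c3 d4), while pcap-ng files start with the Section Header Block type 0a 0d 0d 0a. A minimal sketch of the idea (my illustration, not the project’s actual function):

def is_pcapng(path):
    # Read the first four bytes and compare against the pcap-ng
    # Section Header Block magic (0x0A0D0D0A).
    with open(path, 'rb') as f:
        magic = f.read(4)
    return magic == b'\x0a\x0d\x0d\x0a'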

I’ve also started to write 3rd party API plugins. Currently you can submit an artifact’s MD5 hash to VirusTotal to see if there is an existing report available; in the future, the functionality to upload the file to VirusTotal will be added as well. I’m also planning on adding CloudShark functionality, so you can upload files to a CloudShark appliance or their cloud service.
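As a rough idea of what the VirusTotal lookup involves, here’s a sketch against VirusTotal’s public v2 file report API (not the plugin’s actual code, and you need your own API key):

import requests

def vt_report(md5_hash, api_key):
    # Ask VirusTotal whether it already has a report for this hash.
    params = {'resource': md5_hash, 'apikey': api_key}
    r = requests.get('https://www.virustotal.com/vtapi/v2/file/report',
                     params=params)
    return r.json() if r.status_code == 200 else None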

Vagrant Machine:

Now, one of my concerns when I decided to add a web interface and a database backend was whether or not people had the infrastructure available to accommodate this, so in order to try and make it more accessible I created a Vagrant machine. Vagrant is an awesome tool if you do development work; it allows you to script the creation and build of a machine that is repeatable and gives the same results time after time. Within the sniffMyPackets-web repo there is a vagrant folder. If you have Vagrant & VirtualBox installed (required I’m afraid), you can simply type (from within that vagrant folder):

vagrant up

This will create a Linux server, download, install and configure the MongoDB database, and then install the required libraries for the web interface before starting it up. The initial vagrant up can take about 20 minutes (as it has to download stuff), but after that your vagrant up will have the server ready in minutes.

Maltego Machines:

There are currently two Maltego Machines available. Both serve a different purpose (kind of obvious I know).

[SmP] – Decode pcap: This is run after you have indexed a pcap file; it will extract all the TCP/UDP streams, IPv4 addresses, DNS requests, HTTP requests etc. straight into your Maltego graph.

[SmP] – Replay Session (auto): If you want to replay a pcap file you can add a Session ID entity (with a valid Session ID), return the pcap file and then run this Maltego Machine against it. This runs every 30 seconds (useful if you use the Community Edition of Maltego) and will extract all the information in the database for that pcap file into a Maltego graph. This is awesome if you don’t have the original pcap files, or want to review information at a later stage that is already in the database.

How to get it:

The two GitHub repos can be found here:

Maltego Transforms: https://github.com/SneakersInc/sniffmypacketsv2
Web Interface: https://github.com/SneakersInc/sniffmypacketsv2-web

I’m in the process of adding documentation and all that fluffy stuff, so bear with me, but if you’ve used sniffMyPackets before then the install instructions are the same.

If you want to see it in action there is a YouTube video linked below:

2014 in review

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here’s an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 28,000 times in 2014. If it were a concert at Sydney Opera House, it would take about 10 sold-out performances for that many people to see it.

Scapy: Sessions (or Streams)

I’m in the process of rewriting sniffMyPackets as version 2 (and yes, I’m actually doing it this time); part of that work involves a complete rewrite of the Scapy based code that I used in the original version. That isn’t to say anything is wrong with the original version, but my coding “abilities” have changed, so I wanted to make it more streamlined and hopefully more modular.

In the original sniffMyPackets I used tshark to extract TCP and UDP streams, which wasn’t the ideal solution, however it worked so.. This time around I’m hoping to cut out tshark completely, so I’ve been looking at what else Scapy can do. There is a “hidden” (well, I didn’t know it was there) function called sessions, which takes a pcap file and breaks it down into separate sessions (or streams). The plus side is that it’s actually really quick and doesn’t limit itself to just TCP & UDP based protocols.

I’m still playing with the function at the moment but I’m hoping that I can use this for sniffMyPackets, so I thought I would share my findings with you.

First off, let’s load a pcap file into Scapy.

>>> a = rdpcap('blogexample.pcap')

This is just a sample pcap file with an ICMP ping, DNS lookup and SSL connection to google.co.uk.

If you want to check the pcap has loaded correctly you can now just run:

>>> a.summary()

Cool, so let’s load the sessions into a new variable.

>>> s = a.sessions()

This loads the packets from the pcap file into the new variable as a dictionary object, which I confirmed by doing:

>>> print type(s)
<type 'dict'>

Now let’s take a look at what we’ve got to work with.

>>> s
{'UDP 192.168.11.228:21893 > 208.67.222.222:53': ...SNIP...

That’s just the first session object from the pcap, and it contains two parts (as required for a dictionary object). The first part (the key) is a summary of the traffic: it contains the protocol, the source IP address and port (split by a colon), and then the destination IP address and port (again split by a colon). The second part (the value) is the actual packets that make up the session.

Let’s have a look at how to make use of the first part of the dictionary object (useful if you want a summary of the sessions in a pcap file).

>>> for k, v in s.iteritems():
...     print k
...
UDP 192.168.11.228:21893 > 208.67.222.222:53
UDP 192.168.11.228:44710 > 208.67.222.222:53
TCP 192.168.11.228:53482 > 208.180.20.97:443
UDP 208.67.222.222:53 > 192.168.11.228:12321
...SNIP...
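As an aside, because each key is just a string in the ‘PROTO src:port > dst:port’ format, a quick split() is enough to break it into its parts (a throwaway example of mine):

>>> proto, src, _, dst = 'UDP 192.168.11.228:21893 > 208.67.222.222:53'.split()
>>> print proto, src, dst
UDP 192.168.11.228:21893 208.67.222.222:53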

That loop is a nice and easy way to iterate the dictionary object and get all the keys within it (the summary details). You can of course slice and dice that as you see fit to make it more presentable. The other part of the sessions dictionary is the actual packets; using a similar method we can iterate through the sessions dictionary and pull out all the raw packets.

>>> for k, v in s.iteritems():
...     print v
...

This will output a whole dump of all the packets for all of the sessions, which isn’t that great. Let’s see if we can make it a bit more friendly on the eye.

>>> for k, v in s.iteritems():
...     print k
...     print v.summary()
...
...SNIP...
TCP 208.180.20.97:443 > 192.168.11.228:53482
Ether / IP / TCP 208.180.20.97:https > 192.168.11.228:53482 SA
Ether / IP / TCP 208.180.20.97:https > 192.168.11.228:53482 A
Ether / IP / TCP 208.180.20.97:https > 192.168.11.228:53482 A / Raw
Ether / IP / TCP 208.180.20.97:https > 192.168.11.228:53482 A / Raw
Ether / IP / TCP 208.180.20.97:https > 192.168.11.228:53482 PA / Raw
Ether / IP / TCP 208.180.20.97:https > 192.168.11.228:53482 PA / Raw
Ether / IP / TCP 208.180.20.97:https > 192.168.11.228:53482 PA / Raw
Ether / IP / TCP 208.180.20.97:https > 192.168.11.228:53482 A
Ether / IP / TCP 208.180.20.97:https > 192.168.11.228:53482 PA / Raw
Ether / IP / TCP 208.180.20.97:https > 192.168.11.228:53482 A
Ether / IP / TCP 208.180.20.97:https > 192.168.11.228:53482 PA / Raw
Ether / IP / TCP 208.180.20.97:https > 192.168.11.228:53482 PA / Raw
Ether / IP / TCP 208.180.20.97:https > 192.168.11.228:53482 A
Ether / IP / TCP 208.180.20.97:https > 192.168.11.228:53482 PA / Raw
Ether / IP / TCP 208.180.20.97:https > 192.168.11.228:53482 PA / Raw
...SNIP...

Now isn’t that much better? As you can see from the snippet above, the session it captured starts at the SYN/ACK flag (probably because I started my capture after the SYN was sent), so in reality (and with a bit more testing) it looks like the Scapy sessions function does allow for extracting TCP streams (well, all streams really).

Let’s have a go at something else.

>>> for k, v in s.iteritems():
...     print v.time
...
AttributeError: 'list' object has no attribute 'time'
>>>

Oh well, I was hoping that I could get the timestamp for each packet, but it seems that you need to do some extra work first.

>>> for k, v in s.iteritems():
...     for p in v:
...         print p.time, k
...
1417505523.55 UDP 192.168.11.228:21893 > 208.67.222.222:53
1417505522.45 UDP 192.168.11.228:44710 > 208.67.222.222:53

That works much better. Because the packets in the second half of the dictionary are held in a list object, we need to iterate over them to get access to any of the values within them (such as the packet time).
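Building on that, here’s a quick throwaway example (mine, not from sniffMyPackets) that uses those per-packet timestamps to print a rough duration for each session:

>>> for k, v in s.iteritems():
...     times = [p.time for p in v]
...     print k, max(times) - min(times)
...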

So there you go, I’m off to see what else I can do with this..

Enjoy.

Maltego: Going Remote..

Today we are going to talk about TDS (Transform Distribution Server) transforms. As we discussed in earlier posts, these bad boys run on a remote server and allow you to easily share your transforms with anyone (well, people you want to).

There are some considerations you need to bear in mind when building TDS transforms.

  • The server you are hosting them on needs to have all the dependencies required to run (common sense really)
  • If you are hosting transforms that are public and they spam the crap out of someone else’s systems, you are probably the one who is going to get the blame
  • Use a local development server to test them before pushing to live, and think about using a private code repository (trust me, it’s painful otherwise)

You also need to think about what you want your TDS transform to do. For example, TDS transforms (public ones) are great for web-based queries (API calls etc.), but you probably wouldn’t want to run anything that is CPU intensive or needs an asset uploaded from your local machine in order to work. You can purchase an internal TDS server from Paterva, which might remove some of these restrictions (not sure how they work, to be honest).

I’ve not written a lot of TDS transforms (at the moment), but in this post we are going to take the GetRobots local transform we’ve been working on and move it to a TDS server. In order to use TDS transforms you need a couple of things: a Paterva TDS account and a web server to host the WSGI application.

I’m not going to cover building a web server to host the WSGI application as the documentation linked above does a pretty good job (it worked for me so you should be fine). What we are going to look at today is how to take the local transform and convert it to a TDS transform and what that means in terms of code changes (it’s not that much actually).

So, going back to the very first iteration of the GetRobots transform, let’s just have a look at the code we wrote.

#!/usr/bin/env python

# Maltego transform for getting the robots.txt file from websites

from MaltegoTransform import *
import sys
import requests

website = sys.argv[1]
m = MaltegoTransform()
robots = []

try:
  r = requests.get('http://' + website + '/robots.txt')
  if r.status_code == 200:
    robots = str(r.text).split('\n')
    for i in robots:
      m.addEntity('maltego.Phrase', i)
  else:
    m.addUIMessage("No Robots.txt found..")
except Exception as e:
  m.addUIMessage(str(e))

m.returnOutput()

Only 24 lines of code, and the good news is that we are not going to have to change a lot to make this a TDS transform. Let’s jump straight in and see what the TDS version will look like.

import requests
from Maltego import *

robots = []

def trx_GetMeRobots(m):
  TRX = MaltegoTransform()
  try:
    website = m.Value
    TRX.addUIMessage(str(website))
    url = 'http://' + website + '/robots.txt'
    TRX.addUIMessage(str(url))
    r = requests.get(url)
    if r.status_code == 200:
      robots = str(r.text).split('\n')
      for i in robots:
        ent = TRX.addEntity('maltego.Phrase', i)
    else:
      TRX.addUIMessage("No Robots.txt found..")
  except Exception as msg:
    TRX.addUIMessage("Error:"+str(msg),UIM_PARTIAL)
         
  return TRX.returnOutput()

See, told you it wasn’t that bad. Let’s look at the important changes.

from Maltego import *

Here we have changed the Maltego module we import; the TDS based transforms use a different Python library (which has better functionality in some areas).

def trx_GetMeRobots(m):

For the TDS transforms we use a Python function. The name is your choice, but you need to ensure you have the ‘(m)’ defined, as this is used to pass the entity value(s) from your Maltego graph into the transform.

TRX = MaltegoTransform()

We’ve used TRX as the variable for the MaltegoTransform() call rather than ‘m’, as we already used ‘m’ in the function and it helps me remember it’s a TDS transform. Again we use a try/except block, and as you can see the only real difference is that I’ve added some ‘UIMessage’ updates along the way (which was more for debugging).

TRX.addUIMessage(str(website))
TRX.addUIMessage(str(url))

So hopefully by this point you should have the following:

  • A web server configured to work with the TDS package
  • Your new TDS transform written

The next part is to tweak the TRX.wsgi code to make use of our new transform. The code is below:

import os,sys

# To run in debug, comment the next two lines and see 
# end of this file

os.chdir(os.path.dirname(__file__))
sys.path.append(os.path.dirname(__file__))

from bottle import *
from Maltego import *

# Transform Libraries goes here:
from GetRobots import *


##### TRANSFORM DISPATCHER starts here -->

#-----> GetRobots - Collect contents of robots.txt from a website
@route('/GetRobots', method='GET')
def GetMeRobots():
  if request.body.len>0:
    return(trx_GetMeRobots(MaltegoMsg(request.body.getvalue())))


## ---> Start Server
## To start in debug mode: Comment the line below...
application = default_app()

## ... and uncomment line below
#run(host='0.0.0.0', port=9001, debug=True, reloader=True)

Most of the code above is the default, so I’m just going to cover the bits I’ve changed. The first bit is this:

from GetRobots import *

Essentially this imports our GetRobots code so it’s available to be called by the transform dispatcher. Next we create a new route so we can call the transform from within Maltego.

#-----> GetRobots - Collect contents of robots.txt from a website
@route('/GetRobots', method='GET')
def GetMeRobots():
  if request.body.len>0:
    return(trx_GetMeRobots(MaltegoMsg(request.body.getvalue())))

The ‘@route(‘/GetRobots’, method=’GET’)’ statement is the URL that you want to reference your transform on; the default for ‘method’ is “ANY”, but I’ve changed it to “GET” so it’s not as open. The ‘def GetMeRobots():’ function just holds our call to the transform; the first line of the function, ‘if request.body.len>0:’, makes sure that we only perform the call to the transform if the body length isn’t blank (or zero).

The important part of the next line of code is the call to your transform (based on function name): ‘return(trx_GetMeRobots(MaltegoMsg(request.body.getvalue())))’. You need to ensure that the name of the function, in this case ‘trx_GetMeRobots’, matches the name of your transform’s function.

Still with me? So we now have the following all set up and ready to roll.

    • A web server configured to work with the TDS package
    • Your new TDS transform written
    • A modified TRX.wsgi configured for your new transform

So there are two steps left: we now need (using our TDS account) to set up the Seed & Transform. This is a quick process. Log into your TDS account (https://cetas.paterva.com/TDS/).

Select “Seeds” from the menu, then click on the “Add Seed” button.

We need to fill in some basic details.

Seed Name – A name for your seed, you can create seeds based on the transform roles or environments
Seed URL – Create a unique name which will act as the path to your transforms. If you want them “private” make the URL suitably long and stupid.
Transforms – This will be blank at the moment

Click on “Add Seed” and then, all being well, you should get a message saying the seed was added successfully. Next we need to add the transform. There isn’t an option for “Home”, so you need to click on the hyperlink at the top of the page labelled ‘Paterva Transform Distribution Server 0.1’.

Back on the home page, click on ‘Transforms’, then click on “Add Transform”.

We now need to add in some details, we will do the minimum just to save on time (and me typing).

Transform Name – The name of the transform. It doesn’t support spaces or non-alphanumeric characters.
Transform URL – This is the URL to YOUR server running the transform and the name of the path to the transform (based on what you entered in the TRX.wsgi file).
Input Entity – For our purpose that is maltego.Website, you can add your own entities as well
Owner – This is based on your TDS account details.
Disclaimer – By default a disclaimer is added but you can add your own text as well.
Description – An optional area to put more detail about what your transform does.
Version – Again optional, good if you are making big changes just so people are aware.
Author – Again pulls these details from your TDS account.
Debug – Tick to enable debugging (explains itself really).
Transform Settings – We haven’t defined any and I haven’t worked out what they are really for.
Seeds – You need to select the seed you created, this adds the transform to that seed.

Click “Add Transform”, and again you should get a message something like ‘Successfully Added [TransformName]’

The final task is to add the Seed into Maltego so it will pick up the transform and make it available. This is a nice simple process and you just need the full Seed URL; here’s one I built earlier.

https://cetas.paterva.com/TDS/runner/showseed/49f742b02c27ab2a88b4299a4fb56f31abc6298c

Load up Maltego and click:

Manage – Discover Transforms – Discover Transforms (advanced)

You will now get a little wizard which only needs two pieces of information.

Name – You can call this what you want
URL – The Seed URL

Click Add and then Next

The wizard will go off and verify all the Seeds are available and then present you with a list.

All you need to do here is click Next to carry on (they should all be ticked). This will produce a list of all the transforms that will be loaded into Maltego. Scroll through the list and you should be able to spot the GetRobots transform.

Click Next and it will import the transforms. All being well you will get a confirmation window, and that is it.

Now to test it: create a new graph, drag a Website entity onto it, right-click and select:

Run Transform – Other transforms – GetRobots

The first time you run a TDS transform you will be prompted to accept the disclaimer, which you will need to do if you want the transform to run.

This will run the GetRobots transform from the remote TDS server and should produce the same results as the local transform we wrote.

So there you go, your first TDS transform, now wasn’t that fun? The final blog post in the Maltego series will go over exporting your entities, transform sets and such, so that other people can use them.