Planet Redpill Linpro

27 January 2012

Magnus Hagander

Finding gaps in partitioned sequences

There are an almost unlimited number of articles on the web about how to find gaps in sequences in SQL. And it doesn't have to be very hard. Doing it in a "partitioned sequence" makes it a bit harder, but still not very hard. But when I turned to a window aggregate to do that, I was immediately told "hey, that's a good example of a window aggregate to solve your daily chores, you should blog about that". So here we go - yet another example of finding a gap in a sequence using SQL.

I have a database that is very simply structured - it's got a primary key made out of (groupid, year, month, seq), all integers. On top of that it has a couple of largish text fields and an fti field for full text search. (Initiated people will know right away which database this is). The sequence in the seq column resets to zero for each combination of (groupid, year, month). And I wanted to find out where there were gaps in it, and how big they were, to debug the tool that wrote the data into the database. This is really easy with a window aggregate:


SELECT * FROM (
   SELECT
      groupid,
      year,
      month,
      seq,
      seq-lag(seq,1) OVER (PARTITION BY groupid, year, month ORDER BY seq) AS gap FROM mytable
) AS t
WHERE NOT (t.gap=1)
ORDER BY groupid, year, month, seq
 

One advantage to using a window aggregate for this is that we actually get the whole row back, and not just the primary key - so it's easy enough to include all the data you need to figure something out.
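For readers who want to see the logic outside the database, here is a minimal Python sketch of the same idea: partition by (groupid, year, month), order by seq, and report every row whose seq is not exactly one more than the previous one. The function name and the sample rows are made up for illustration; this is not how PostgreSQL executes the query.

```python
from itertools import groupby
from operator import itemgetter

def find_gaps(rows):
    """Yield (groupid, year, month, seq, gap) for every row whose seq is not
    exactly one more than the previous seq in the same partition."""
    partition_key = itemgetter(0, 1, 2)
    # Sorting the tuples orders them by groupid, year, month and then seq.
    for part, members in groupby(sorted(rows), key=partition_key):
        prev = None
        for row in members:
            seq = row[3]
            # The first row of a partition has no predecessor, like lag() returning NULL.
            if prev is not None and seq - prev != 1:
                yield (*part, seq, seq - prev)
            prev = seq

# Hypothetical sample data: (groupid, year, month, seq)
rows = [
    (1, 2012, 1, 0), (1, 2012, 1, 1), (1, 2012, 1, 4),
    (1, 2012, 2, 0), (1, 2012, 2, 1),
    (2, 2012, 1, 0), (2, 2012, 1, 2),
]
gaps = list(find_gaps(rows))
print(gaps)
```

Like the SQL version, the first row of each partition is never reported, so a partition that starts at something other than zero would need a separate check.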

What about performance? I don't really have a big database to test this on, so I can't say for sure. It's going to be a sequential scan, since I look at the whole table, and not just parts of it. It takes about 4 seconds to run over a table of about a million rows, 2.7GB, on a modest VM with no actual I/O capacity to speak of and a very limited amount of memory, returning about 100 rows. It's certainly fast enough for me in this case.

And as a bonus, it found me two bugs in the loading script and at least one bug in somebody else's code that I'm now waiting on to get fixed...

by Magnus Hagander at Fri 27 Jan 2012, 16:53

23 January 2012

Trygve Vea

New Munin-plugin for HAProxy

I committed a new Munin plugin for HAProxy. It’s a multigraph plugin that discovers all the configured frontends and backends automatically. All you need to provide is the username/password for the HAProxy status page.

It produces 8 graphs, plus subgraphs for some of the backends that present the same graphs with server-specific metrics.

Some of the root graphs: [graph images not included]

Do you use HAProxy and Munin? Check it out!

by Trygve Vea at Mon 23 Jan 2012, 06:00

19 January 2012

Ingvar Hagelund

Finding what binaries to restart

When I started working with Linux system administration a few years ago, restarting services after a package upgrade was fairly easy. If the package didn’t restart itself, one could always ask lsof for help:

lsof +L1 | egrep 'bin/|lib/'

Now, on later Linux distributions, the use of prelink has changed this, so one usually gets a lot of false positives and can no longer trust the result of that good old lsof output.

Finding running executables is possible using some perl magic (Yes, I’m pretty sure you perl guys can write this more compressed) along the lines of this, at least on RHEL5:

perl -e ' for $i (glob "/proc/[1-9]*/exe") { $f=readlink $i; if ( $f=~ /([^\0]+)\0.*deleted/ ) { print "$1\n" }} ' | sort | uniq
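The same idea can be sketched in Python. This is a looser variant than the perl one-liner: it assumes the kernel marks a replaced executable by appending " (deleted)" to the /proc/<pid>/exe link target, and it cuts the path at a NUL byte if one is present. The function names are made up for illustration.

```python
import glob
import os

DELETED_SUFFIX = " (deleted)"

def deleted_target(link_target):
    """Return the original path if a /proc/<pid>/exe target is marked deleted, else None."""
    # Assumption: a replaced binary shows up with " (deleted)" at the end of the
    # link target, optionally with a NUL byte and extra text before the marker.
    if not link_target.endswith(DELETED_SUFFIX):
        return None
    path = link_target[: -len(DELETED_SUFFIX)]
    return path.split("\0", 1)[0]

def running_deleted_executables():
    """Scan /proc and return the sorted set of running executables that were deleted."""
    found = set()
    for exe in glob.glob("/proc/[1-9]*/exe"):
        try:
            target = os.readlink(exe)
        except OSError:
            continue  # the process exited, or we are not allowed to look
        path = deleted_target(target)
        if path:
            found.add(path)
    return sorted(found)

if __name__ == "__main__":
    for path in running_deleted_executables():
        print(path)
```

Run it as root to see every process; as a normal user, links that cannot be read are silently skipped.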

But this won’t help us find what service to restart after a dependency library was updated. So I finally wrote this script to help me. My boxes are mostly Fedora and RHEL, so it uses the fact that installed binaries on Red Hat based systems have their installation time stored in the rpm database (rpm tag %{INSTALLTIME} ). This script abuses rpm heavily, and may take some time to finish on a busy system.

http://users.linpro.no/ingvar/check_newlibs

Test run:

[root]# /home/ingvar/check_newlibs
Warning: Needs restart: /sbin/agetty, pids 6067
Warning: Needs restart: /usr/bin/tail, pids 7315
Warning: Needs restart: /usr/bin/vim, pids 19759
Warning: Needs restart: /usr/sbin/sendmail.sendmail, pids 10645 10637
Warning: Needs restart: /usr/sbin/acpid, pids 5259
Warning: Needs restart: /usr/sbin/crond, pids 5567
Warning: Needs restart: /bin/bash, pids 26074 17731 16848 15718 30753 6120 32704
Warning: Needs restart: /sbin/mingetty, pids 6071 6069 6076 6068 6072 6070
Warning: Needs restart: /sbin/portmap, pids 5082
Warning: Needs restart: /usr/sbin/smartd, pids 20948
Warning: Needs restart: /sbin/multipathd, pids 20170
Warning: Needs restart: /usr/sbin/atd, pids 5969
Warning: Needs restart: /usr/sbin/sshd, pids 19863
Warning: Needs restart: /usr/libexec/mysqld, pids 17775

by ingvar at Thu 19 Jan 2012, 21:39

17 January 2012

Edward Bjarte Fjellskål

PassiveDNS 0.2.9

I added some features and changes to PassiveDNS. The most important change is that the output now contains the TTL value, so you need to use the current tools/* (if you use them) as they are also changed to work with this new output format (or update your own tools).

I also added the ability to specify the DNS record types that you want to log from the command line and I added support for more record types. PassiveDNS now should be able to track: A, AAAA, CNAME, DNAME, NAPTR, SOA, PTR, RP, SRV, TXT, MX and NS.

Support for chroot and dropping privileges has also been added.

I also added some features to tools/pdns2db.pl while I was at it:
1) You can now process a passivedns.log file in “batch” mode, exiting when finished.
2) You can now specify a file with a list of domains or IPs to skip insertion to the DB.
3) You can now specify a file with a list of PCRE (Perl Compatible Regular Expressions) of “domains/IPs” to skip insertion to the DB.
4) You can now specify a file with a list of domains or IPs to alert on!
5) You can now specify a file with a list of PCRE of “domains/IPs” to alert on!
6) You can now specify a file with a list of domains to whitelist and not alert on.
7) You can now specify a file with a list of PCRE of “domains/IPs” to whitelist and not alert on.

The skiplists are checked first. If the domain/IP is found or matched there, the whitelist and blacklist are ignored and the record is not inserted into the DB.

Next, the whitelists are checked. If a domain/IP is found there, or matches a PCRE that you have defined, it will not be checked against the blacklist.

Last, the blacklists are checked. If a domain/IP is found there, or matches a PCRE that you have defined, the PassiveDNS record is written to the alert file that you specify (Default: /var/log/passivedns-alert.log).
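The order of precedence described above can be sketched in a few lines of Python. This is only an illustration, not the code in tools/pdns2db.pl: the function name, the return values and the example lists are all made up, and Python's re module stands in for PCRE.

```python
import re

def classify(name, skip, white, black):
    """Return 'skip', 'whitelisted', 'alert' or 'store' for a domain or IP.

    Each list is a pair: (set of exact names, list of compiled patterns).
    """
    def hit(rules):
        exact, patterns = rules
        return name in exact or any(p.search(name) for p in patterns)

    if hit(skip):
        return "skip"         # not inserted into the DB, other lists ignored
    if hit(white):
        return "whitelisted"  # stored, but the blacklist is not consulted
    if hit(black):
        return "alert"        # stored and written to the alert file
    return "store"            # stored, no alert

# Hypothetical example lists
skip = ({"localhost"}, [re.compile(r"\.in-addr\.arpa$")])
white = ({"www.example.com"}, [])
black = (set(), [re.compile(r"\.example\.com$")])

for name in ("localhost", "www.example.com", "evil.example.com", "example.org"):
    print(name, "->", classify(name, skip, white, black))
```

The point of the ordering is that www.example.com never reaches the blacklist pattern, even though that pattern would match it.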

There are different sources for getting lists of known bad domains. Here is one if you want to test the blacklist functionality: http://isc.sans.edu/feeds/suspiciousdomains_High.txt

I’m pretty far along when it comes to planned features at this stage. Please try out PassiveDNS and beat the crap out of it :) I will probably “up” the version to 0.5.0 soon, and from there on it is just testing, testing and more testing before it becomes a “one dot O” release.

If you have any issues with PassiveDNS, please submit them here.

by Edward Bjarte Fjellskål at Tue 17 Jan 2012, 15:17

12 January 2012

Edward Bjarte Fjellskål

PRADS, and how it compares to pads and p0fv2 and p0fv3

The question was brought up to me late last night on IRC, as p0fv3 RC was recently announced. This is a short answer to that question:

“People that find the PRADS page and already know p0f or pads may be interested in a comparison or essentially arguments why you would use one over the other.”

First off, it’s exciting to see Michal Zalewski back with p0fv3 :) I quickly read through his code yesterday and tested it out, and it’s rather interesting how he solves things. The fingerprint database is limited at the moment, but expect that to grow in the near future. I also love the informal output in his applications :)

[PRADS vs PADS]
So, back to the questions. First off, pads (“Passive Asset Detection System”) uses regexp syntax to look for common bytes in the payload to identify the server application. So if the server says “Server: Apache/2.2.3 (Linux/SUSE)”, that is collected as the service running on the server port where it was detected. The “rules” can be written more specifically for each server software, but are rather general and few today. Some pads “rules” look for ASCII strings, and some for different bytes in hex etc. to identify stuff like SSL/TLS. Pads is no longer actively developed by the original author, but I do maintain a fork of the last version with enhancements added.

PRADS extends the way pads does asset detection. We have built IPv6 support into PRADS, so it also detects assets listening on IPv6 addresses. We also have built-in connection tracking, so that we can cut off detection in a stream after a given number of packets or bytes seen from the client or server. This avoids looking for server/client assets in connections that transfer big files or are encrypted etc. Most “banners/identifiers” are in the first packets, so limiting how many packets in a stream we do detection on helps performance.
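The cut-off idea can be shown with a small Python sketch. It is not PRADS code: the class name and the default limits are hypothetical, and it keeps one budget per flow instead of separate budgets for client and server.

```python
class FlowBudget:
    """Stop asset detection on a flow once a packet or byte budget is spent."""

    def __init__(self, max_packets=10, max_bytes=4096):  # hypothetical limits
        self.max_packets = max_packets
        self.max_bytes = max_bytes
        self.seen = {}  # flow key -> (packets inspected, payload bytes inspected)

    def should_inspect(self, flow, payload_len):
        packets, nbytes = self.seen.get(flow, (0, 0))
        if packets >= self.max_packets or nbytes >= self.max_bytes:
            return False  # budget spent: skip the expensive signature matching
        self.seen[flow] = (packets + 1, nbytes + payload_len)
        return True

budget = FlowBudget(max_packets=2, max_bytes=1000)
flow = ("10.0.0.1", 40000, "10.0.0.2", 80)
decisions = [budget.should_inspect(flow, 100) for _ in range(4)]
print(decisions)
```

A long file transfer then costs only a dictionary lookup per packet once the first few packets have been inspected.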

To extend pads a bit, we also added detection for client applications using the same method as for detecting server applications.

My future thoughts on enhancing the pads/PRADS asset rules are to make them more like the Snort/Suricata rule language and use fast pattern matching before invoking the pcre method etc. Pads does no OS fingerprinting per se, btw.

[PRADS vs p0f]
PRADS tcp fingerprinting was based on the p0fv2 way, as p0f had the fingerprint DB and we thought that reusing the fingerprints would make it easier for people to migrate if they wanted, instead of recollecting and adding fingerprints. PRADS also added some touches of its own (for IPv6 etc.) and in the way we match the fingerprints (and fuzzing). We have thought about extending and rewriting the fingerprints, but that’s for the future. Right now they are doing a good job. We also apply all the p0fv2 ways of fingerprinting to the whole tcp session, from the syn to the rst/fin. p0fv2 could only use one method at a time, depending on how you started p0fv2. PRADS outputs all the info it gathers, and leaves the final correlation to the end user/program etc. A good example of that is prads-asset-report and prads2snort, which add weight to each type of fingerprint, ranking the syn and syn+ack higher than stray-ack, rst and fin etc. You can also base the final guess on client or server applications, say if the User-Agent contains “Linux” or “Windows NT 6.1” or “Macintosh; Intel Mac OS X 10.7”, or if the Server string of the web server is “Microsoft-IIS 6.0” or “Apache 2.2.15 (FreeBSD)” or “Apache 2.2.3 (Red Hat)” etc.
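To make the weighting idea concrete, here is a minimal Python sketch. The numeric weights are hypothetical; only the ordering (syn and syn+ack above stray-ack, rst and fin) comes from the text, and this is not how prads-asset-report or prads2snort are actually implemented.

```python
from collections import defaultdict

# Hypothetical weights: only the relative ordering is taken from the text.
WEIGHTS = {"syn": 4, "synack": 4, "ack": 1, "rst": 1, "fin": 1}

def best_guess(observations):
    """observations: iterable of (fingerprint_type, os_name) pairs for one asset.

    Returns the OS name with the highest total weight, or None if there is no data.
    """
    scores = defaultdict(int)
    for fp_type, os_name in observations:
        scores[os_name] += WEIGHTS.get(fp_type, 0)
    return max(scores, key=scores.get) if scores else None

seen = [("syn", "Linux"), ("ack", "Windows"), ("rst", "Windows"), ("fin", "Windows")]
print(best_guess(seen))
```

With these weights a single syn match outvotes three weaker matches, which is the kind of decision the correlation step has to make.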

The p0fv3 tcp fingerprints are written in a new way: a new fingerprint file format makes it easy to add different types of fingerprints (TCP/HTTP/SMTP etc.) to one and the same file. The most significant enhancement in the TCP fingerprints that I see is the MSS and MTU multiplier field. p0fv3 also detects new quirks not measured in p0fv2. The rules are now also more human readable. Example:

# RULE
label = s:unix:Linux:2.6.x
sig = *:64:0:*:mss*4,6:mss,sok,ts,nop,ws:df,id+:0

# Will match:
.-[ X.X.X.X/58435 -> Y.Y.Y.Y/22 (syn) ]-
|
| client = X.X.X.X/58435
| os = Linux 2.6.x
| dist = 9
| params = none
| raw_sig = 4:55+9:0:1460:mss*4,6:mss,sok,ts,nop,ws:df,id+:0
|
`----

The way the tcp fingerprints are matched has also changed a bit, and I believe Michal Zalewski has done this for good reasons and that it will enhance the detection.
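To show how the rule and the raw_sig above line up field by field, here is a deliberately simplified Python sketch. The field names follow my reading of the p0fv3 fingerprint format (ver:ittl:olen:mss:wsize,scale:olayout:quirks:pclass) and should be treated as an assumption. The matching is much cruder than what p0fv3 really does: "*" is a wildcard, the TTL is compared after adding the distance, and everything else is compared as a plain string.

```python
FIELDS = ("ver", "ittl", "olen", "mss", "wsize_scale", "olayout", "quirks", "pclass")

def parse(sig):
    """Split a signature into its eight colon-separated fields."""
    parts = sig.split(":")
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    return dict(zip(FIELDS, parts))

def initial_ttl(value):
    # raw_sig writes the TTL as observed+distance, e.g. 55+9 for an initial TTL of 64.
    return sum(int(n) for n in value.split("+"))

def matches(rule_sig, raw_sig):
    """Simplified match: wildcards, summed TTL, string equality for the rest."""
    rule, raw = parse(rule_sig), parse(raw_sig)
    for name in FIELDS:
        want, got = rule[name], raw[name]
        if want == "*":
            continue
        if name == "ittl":
            if initial_ttl(got) != int(want):
                return False
        elif want != got:
            return False
    return True

rule = "*:64:0:*:mss*4,6:mss,sok,ts,nop,ws:df,id+:0"
raw = "4:55+9:0:1460:mss*4,6:mss,sok,ts,nop,ws:df,id+:0"
print(matches(rule, raw))
```

The rule leaves the IP version and the MSS open, and the observed TTL of 55 plus a distance of 9 gives the initial TTL of 64 that the rule asks for.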

Besides the new tcp fingerprint changes, p0fv3 also adds application layer detection. I looked at the HTTP stuff, and p0fv3 also matches on the HTTP header order and doesn’t blindly trust the User-Agent, as we do in PRADS. We have thought about extending the “rule/signature” in PRADS to be more Snort/Suricata like, so you can have more content matches etc. More accuracy can already be achieved today by using the pcre language to verify header order etc. before trusting the UA. But I think pcre is way too expensive when used alone, so organizing the signatures/rules better internally and having something like a fast_pattern matcher would help a lot. Quick pcre example for a User-Agent with a specific HTTP header order:

# Detects Firefox/3.6.X with HTTP header order to add confidence in the match.
# PRADS rule:
http,v/MFF 3.6.X/$1//,\r\nHost: .*\r\nUser-Agent: Mozilla\/5\.0 (.*Firefox\/3\.6\..*)\r\nAccept: .*\r\nAccept-Language: .*\r\nAccept-Encoding: .*\r\nAccept-Charset:

Running it in PRADS on an old pcap gives me:

# Client IPs redacted just to be kind
[client:MFF 3.6.X (X11; U; Linux x86_64; en-US; rv:1.9.2.12) Gecko/20101027 Ubuntu/10.04 (lucid) Firefox/3.:80:6],[distance:8]
[client:MFF 3.6.X (X11; U; Linux x86_64; en-GB; rv:1.9.2.12) Gecko/20101027 Ubuntu/10.10 (maverick) Firefox:80:6],[distance:11]
[client:MFF 3.6.X (Windows; U; Windows NT 5.1; de; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12 ( .NET CLR 3.:80:6],[distance:10]
[client:MFF 3.6.X (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12 (.NET CLR :80:6],[distance:14]
[client:MFF 3.6.X (X11; U; Linux x86_64; en-US; rv:1.9.2.12) Gecko/20101027 Linux Mint/10 (Julia) Firefox/3:80:6],[distance:15]
[client:MFF 3.6.X (X11; U; Linux x86_64; en-US; rv:1.9.2.12) Gecko/20101027 Ubuntu/10.10 (maverick) Firefox:80:6],[distance:9]
[client:MFF 3.6.X (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12:80:6],[distance:6]
[client:MFF 3.6.X (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.6) Gecko/20100625 Firefox/3.6.6:80:6],[distance:12]
[client:MFF 3.6.X (Windows; U; Windows NT 6.1; es-ES; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12:80:6],[distance:14]
[client:MFF 3.6.X (X11; U; Linux x86_64; en-US; rv:1.9.2.10) Gecko/20101005 Fedora/3.6.10-1.fc14 Firefox/3.:80:6],[distance:8]
[client:MFF 3.6.X (X11; U; Linux x86_64; en-US; rv:1.9.2.12) Gecko/20101027 Ubuntu/10.04 (lucid) Firefox/3.:80:6],[distance:12]

The whole User-Agent is not grabbed, and we need to extend that in the future. But the pcre language makes it possible to match on as much content as you want, to get the confidence you need in your signatures/rules for detecting assets. PRADS looks for client and server applications on all ports, on both UDP and TCP, and for both IPv4 and IPv6.
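The header-order pattern from the PRADS rule can be tried outside PRADS. Below is a minimal sketch using Python's re module, which is not PCRE but supports everything this pattern uses. The two requests are made-up test data: one with the header order the rule expects, and one with Host and User-Agent swapped.

```python
import re

# Same pattern as the PRADS rule above, minus the escaping of "/".
FIREFOX_36 = re.compile(
    r"\r\nHost: .*\r\nUser-Agent: Mozilla/5\.0 (.*Firefox/3\.6\..*)"
    r"\r\nAccept: .*\r\nAccept-Language: .*\r\nAccept-Encoding: .*\r\nAccept-Charset:"
)

UA = ("User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.12) "
      "Gecko/20101027 Firefox/3.6.12\r\n")
REST = ("Accept: text/html\r\n"
        "Accept-Language: en-us,en;q=0.5\r\n"
        "Accept-Encoding: gzip,deflate\r\n"
        "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n\r\n")

# Hypothetical requests: expected header order, and Host/User-Agent swapped.
expected_order = "GET / HTTP/1.1\r\nHost: www.example.com\r\n" + UA + REST
swapped_order = "GET / HTTP/1.1\r\n" + UA + "Host: www.example.com\r\n" + REST

good = FIREFOX_36.search(expected_order)
bad = FIREFOX_36.search(swapped_order)
print(good.group(1) if good else None)
print(bad)
```

A client that merely copies the Firefox User-Agent string but sends its headers in another order does not match, which is the extra confidence the rule is after.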

[PRADS vs The World]
Right now we are working on adding the DHCP OS fingerprinting and ICMP OS fingerprinting. DHCP is pushed to the git master on github but is not fully integrated into the PRADS core yet, but printing and matching is working, so you can help add fingerprints if you want :) . The ICMP part is tricky as I want to fingerprint on the protocol layer, and also the payload, so I kind of have to combine the p0f way with the pads way of detecting and matching.

PRADS also has lots of other stuff, like connection tracking/flow gathering with output compatible with cxtracker and sancp. I have also been working on my passivedns project, and I intend to port the relevant functions over to PRADS, so we can have domain names mapped to assets too.

p0fv3 has an API so you can talk to it, to fetch relevant info about the IPs it knows about. I see p0fv3 with this functionality aimed at mail and web servers etc, to determine if this is spam or ham stuff coming its way, but you can use it in lots of cool ways.
I know from people I have talked to that PRADS is used in much the same way. An example that Kacper put up can be found on http://prads.delta9.pl/. On the road map for upcoming PRADS releases, we have access to assets via shared memory. That will make it easier to extract current info from the running PRADS process. PRADS also ships with prads2db.pl, which parses a prads asset log file and inserts the info into a DB so you can query it.

The PRADS philosophy is something like: “If it can be detected passively, PRADS should probably do it.”

So if you are comparing in order to decide which application to go for, I would say use them all, and correlate the knowledge that each tool gives you. You can even add the output from the active fingerprinting tool nmap into the mix.

That said, much of my view on PRADS comes from using it in my Network Security Monitoring setup and from my wish to “know as much as possible about my assets”. If you have any wishes or suggestions, good or bad etc., feel free to contact us.

E

by Edward Bjarte Fjellskål at Thu 12 Jan 2012, 10:49

09 January 2012

Edward Bjarte Fjellskål

Suricata and some phun with flowints

I have been looking into malware traffic that is hard to make signatures for in a “regular” way. I’m not a malware reverser, so I don’t dig into malware to determine byte_tests and jumps etc. in binary protocols. This led me to use a lot of flowbits at first for my sigs, but the performance in Snort and Suricata was “crap”, to put it nicely. So I talked to Victor Julien, lead programmer of Suricata, about implementing packet and byte counting in Suricata. I want to count each packet sent by the client and the server, and the total number of bytes sent by each. Talking back and forth, Victor convinced me that it might be best to go for a byte count on reassembled streams. So I added a feature request to Suricata. I have since updated the feature request to add the packet and byte counters, as I think they will be of great use.

Talking to Matt Jonkman (Emerging Threats Pro), he pointed me to flowint in Suricata to try to solve my packet counting. So in Suricata 1.1.1, you can do something like this to initialize the packet counters:

# Initialize the packet counter (Suricata 1.1.1 and some older versions)
#alert ip $HOME_NET any -> $EXTERNAL_NET any (msg:"Generic Client Established Flow IP Packet Counter set"; flow:established,from_client; flowint:client_packet,notset; flowint:client_packet,=,0; flowbits:noalert; classtype:not-suspicious; sid:1; rev:1;)

#alert ip $EXTERNAL_NET any -> $HOME_NET any (msg:"Generic Server Established Flow IP Packet Counter set"; flow:established,from_server; flowint:server_packet,notset; flowint:server_packet,=,0; flowbits:noalert; classtype:not-suspicious; sid:2; rev:1;)

In Suricata 1.2dev (rev 4c1e417) and newer (I did my tests for this blog post on that version), you don’t need to initialize the counter, as it will automagically be initialized to zero, so you don’t need sid:1 and sid:2:

## Generic packet counter: (This could be better done internally in Suricata/Snort? and not with rules?)
alert ip $HOME_NET any -> $EXTERNAL_NET any (msg:"Generic Client Established Flow IP Packet Counter"; flow:established,from_client; flowint:client_packet,+,1; flowbits:noalert; classtype:not-suspicious; sid:3; rev:1;)

alert ip $EXTERNAL_NET any -> $HOME_NET any (msg:"Generic Server Established Flow IP Packet Counter"; flow:established,from_server; flowint:server_packet,+,1; flowbits:noalert; classtype:not-suspicious; sid:4; rev:1;)
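What the two counter rules do can be pictured with a tiny Python model: every flow carries two integers that start at zero, and each packet bumps the one for its direction. This is only an illustration of the bookkeeping, with a made-up class name; it says nothing about how Suricata stores flowints internally.

```python
from collections import defaultdict

class PacketCounters:
    """Per-flow packet counters, one per direction, starting at zero."""

    def __init__(self):
        self.flows = defaultdict(lambda: {"client_packet": 0, "server_packet": 0})

    def on_packet(self, flow_id, from_client):
        name = "client_packet" if from_client else "server_packet"
        self.flows[flow_id][name] += 1  # same effect as flowint:<name>,+,1
        return self.flows[flow_id][name]

counters = PacketCounters()
flow = ("192.168.1.10", 1031, "216.246.8.230", 443)
counters.on_packet(flow, from_client=True)
counters.on_packet(flow, from_client=False)
counters.on_packet(flow, from_client=True)
print(counters.flows[flow])
```

Other rules can then compare against these counters, which is what the flowint checks in the rest of this post do.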

So, what can you do with packet counters?

First off, let’s look at some generic rules I made up to test with, which basically should limit detection in a stream to the first 29 packets from the client:

# GENERIC GET
alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"GENERIC GET (classic)"; flow:from_client,established; content:"GET "; depth:4; content:!"connection: keep-alive"; nocase; http_header; classtype:not-suspicious; sid:5; rev:1;)

alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"GENERIC GET (flowint)"; flow:from_client,established; flowint:client_packet,<,30; content:"GET "; depth:4; content:!"connection: keep-alive"; nocase; http_header; classtype:not-suspicious; sid:6; rev:1;)

# GENERIC UA
alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"GENERIC User-Agent (classic)"; flow:from_client,established; content:"User-Agent: "; http_header; content:!"connection: keep-alive"; nocase; http_header; classtype:not-suspicious; sid:7; rev:1;)

alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"GENERIC User-Agent (flowint)"; flow:from_client,established; flowint:client_packet,<,30; content:"User-Agent: "; http_header; content:!"connection: keep-alive"; nocase; http_header; classtype:not-suspicious; sid:8; rev:1;)

Sid 5 and 6 look for an HTTP GET request that is not an HTTP keep-alive. Sid 7 and 8 look for a User-Agent in a non-keep-alive HTTP request. What the flowint versions of the rules have in common is that they are limited to the first 29 packets from the client in an established flow. Running Suricata against 2009-04-20-09-05-46.dmp etc. shows some interesting results:

Num  Rule  Gid  Rev       Ticks      %  Checks  Matches  Max Ticks  Avg Ticks  Avg Match  Avg No Match
---  ----  ---  ---  ----------  -----  ------  -------  ---------  ---------  ---------  ------------
  1     4    1    1  1695335708  67.74  510720   510720    6412616    3319.50    3319.50          0.00
  2     3    1    1   581354624  23.23  508970    82175    3602972    1142.22    3061.99        772.59
  3     7    1    1   135943292   5.43    7900     2352     499972   17208.01   16156.62      17653.74
  4     5    1    1    43040648   1.72    3313     2517     199052   12991.44   16247.74       2694.82
  5     8    1    1    29172972   1.17    7900     2352     434592    3692.78    6588.51       2465.18
  6     6    1    1    17917112   0.72    3313     2517     353684    5408.12    6528.93       1864.06

Sorry for the formatting :)
First, if we look at sid 5 and 6, we see that they were both checked 3313 times and matched 2517 times. If we look at total ticks, sid 5 uses 43040648 ticks and sid 6 (flowint) uses 17917112 ticks. Average ticks are 12991.44 for sid 5 and 5408.12 for sid 6 (flowint).

Looking at sid 7 and 8, we see that they were both checked 7900 times and matched 2352 times. If we look at total ticks, sid 7 uses 135943292 ticks and sid 8 (flowint) uses 29172972 ticks. Average ticks are 17208.01 for sid 7 and 3692.78 for sid 8 (flowint).

A basic conclusion from this test is that the rules with the flowint check are faster and give you the same alerts.
But if we look at the ticks sid 3 and 4 use to count all the packets, they are high in total but low on average. So they are not expensive per check, but since they are checked (and possibly incremented) for each packet, the total ticks are relatively high. Having this in the core of Suricata and Snort would probably make them less expensive (hint hint).
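To put numbers on "faster", the total ticks from the first table can be compared directly. A small Python calculation, using only the figures shown above:

```python
# Total ticks per sid, copied from the first profiling table.
ticks = {3: 581354624, 4: 1695335708, 5: 43040648, 6: 17917112,
         7: 135943292, 8: 29172972}

def ratio(classic_sid, flowint_sid):
    """How many times more ticks the classic rule used than its flowint twin."""
    return ticks[classic_sid] / ticks[flowint_sid]

saved = (ticks[5] - ticks[6]) + (ticks[7] - ticks[8])
counter_cost = ticks[3] + ticks[4]

print("GET rule, classic/flowint tick ratio:        %.1f" % ratio(5, 6))
print("User-Agent rule, classic/flowint tick ratio: %.1f" % ratio(7, 8))
print("ticks saved by the flowint rules:   %d" % saved)
print("ticks spent by the counter rules:   %d" % counter_cost)
```

The two flowint rules come out roughly 2.4 and 4.7 times cheaper than their classic twins, but in this small rule set the two counter rules still cost far more ticks than the flowint checks save, which is why native counters would help.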

So what more c00l stuff can we do with packet counters?

Some malware I stumbled upon will give you an example (Mostly used in the Gheg Spam bot, aka Tofsee/Mondera)
b31e4624cdc45655b468921823e1b72b
3c453e40ff63da3c2a914c29b6c62ee0
e8034335afb724d8fe043166ba57cd23

It seems to communicate in a binary way (encrypted), but looking at the 5 different pcaps I got, I saw a pattern, and my flowint counters came to good use. It seems like the client and server send packets with a specific payload size in different parts of the communication. I did not see any obvious content to match on, so content matches didn’t seem trivial, and this is a great way to demonstrate my point: flowint + packet counters to the rescue! Here is the tcpdump output of the traffic on port 443 (not including the port 22050 traffic, which is much longer, but starts the same way), so you can see the packet sizes and the order in which they arrive in these short sessions:

reading from file b31e4624cdc45655b468921823e1b72b.pcap, link-type EN10MB (Ethernet)
03:47:02.571111 IP 192.168.1.10.1031 > 216.246.8.230.443: Flags [S], seq 910650996, win 65535, options [mss 1460,nop,nop,sackOK], length 0
03:47:02.608784 IP 216.246.8.230.443 > 192.168.1.10.1031: Flags [S.], seq 442582883, ack 910650997, win 5840, options [mss 1380,nop,nop,sackOK], length 0
03:47:02.608977 IP 192.168.1.10.1031 > 216.246.8.230.443: Flags [.], ack 1, win 65535, length 0
03:47:02.646959 IP 216.246.8.230.443 > 192.168.1.10.1031: Flags [P.], seq 1:201, ack 1, win 5840, length 200
03:47:02.647342 IP 192.168.1.10.1031 > 216.246.8.230.443: Flags [P.], seq 1:142, ack 201, win 65335, length 141
03:47:02.685098 IP 216.246.8.230.443 > 192.168.1.10.1031: Flags [.], ack 142, win 6432, length 0
03:47:02.718986 IP 216.246.8.230.443 > 192.168.1.10.1031: Flags [P.], seq 201:676, ack 142, win 6432, length 475
03:47:02.718999 IP 216.246.8.230.443 > 192.168.1.10.1031: Flags [F.], seq 676, ack 142, win 6432, length 0
03:47:02.719268 IP 192.168.1.10.1031 > 216.246.8.230.443: Flags [.], ack 677, win 64860, length 0
03:47:02.719584 IP 192.168.1.10.1031 > 216.246.8.230.443: Flags [F.], seq 142, ack 677, win 64860, length 0
03:47:02.757350 IP 216.246.8.230.443 > 192.168.1.10.1031: Flags [.], ack 143, win 6432, length 0

And here is how I sigged it:

# Backdoor:Win32/Tofsee (aka: Gheg / Mondera)
alert tcp $EXTERNAL_NET any -> $HOME_NET any (msg:"Possible Tofsee server Packet 2 (200 Bytes)"; flow:established,from_server; flowint:server_packet,=,2; dsize:200; flowbits:set,Tofsee_SERVER_200; flowbits:noalert; classtype:trojan-activity; sid:9; rev:1;)

alert tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"Possible Tofsee client Packet 3 (141 Bytes)"; flow:established,from_client; flowint:client_packet,=,3; dsize:141; flowbits:isset,Tofsee_SERVER_200; flowbits:set,Tofsee_CLIENT_141; flowbits:noalert; classtype:trojan-activity; sid:10; rev:1;)

alert tcp $EXTERNAL_NET any -> $HOME_NET any (msg:"Possible Tofsee server Packet 4(475 Bytes)"; flow:established,from_server; flowint:server_packet,=,4; dsize:475; flowbits:isset,Tofsee_CLIENT_141; classtype:trojan-activity; sid:11; rev:1;)

Sid 9 looks only at the 2nd packet from the server (C&C) in an established flow, and that packet must have a payload size (dsize) of 200. If it hits, it sets the flowbit Tofsee_SERVER_200. The rule has noalert, because this check alone could easily trigger false positives, so we need some more checks. Sid 10 checks only client packet 3: it must have a payload size (dsize) of 141, and the flowbit Tofsee_SERVER_200 has to be set for it to match. Sid 10 is also noalert, as we can still check some more to avoid being spammed by false positives. So sid 11 checks that server packet 4 has a payload size (dsize) of 475 and that the flowbit Tofsee_CLIENT_141 is set. Now we can raise an alert, as this is probably a unique set of conditions. Testing again with our 2009-04-20-09-05-46.dmp test pcap, we get:

Num  Rule  Gid  Rev       Ticks      %  Checks  Matches  Max Ticks  Avg Ticks  Avg Match  Avg No Match
---  ----  ---  ---  ----------  -----  ------  -------  ---------  ---------  ---------  ------------
  1     4    1    1  1727862376  63.39  510720   510720   14059784    3383.19    3383.19          0.00
  2     3    1    1   508719672  18.66  508970    82176    3689732     999.51    2830.58        646.95
  3     7    1    1   140271824   5.15    7900     2352    1013800   17755.93   18570.93      17410.42
  4     9    1    1   101662288   3.73   28419        0    6625384    3577.26       0.00       3577.26
  5    11    1    1    84264720   3.09   32938        0     612848    2558.28       0.00       2558.28
  6    10    1    1    71553560   2.62   32938        0     576132    2172.37       0.00       2172.37
  7     5    1    1    42053248   1.54    3313     2517     805736   12693.40   15831.10       2771.81
  8     8    1    1    31547660   1.16    7900     2352     153972    3993.37    7039.04       2702.21
  9     6    1    1    17944504   0.66    3313     2517     292508    5416.39    6476.95       2062.83

Overall, sid 9, 10 and 11 did not do that badly here. And the best thing is, they all have 0 matches. I ran this on many of my test pcaps, and I’ve not been close to a false positive. Sid 10 seems to fire sometimes, but not the others, so I guess this is a rather unique combination of packets in a stream, and a viable way to sig malware like this. Also, we could add a check for the TCP “PUSH” flag in sid 9, 10 and 11 to be more accurate if needed.
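The chain of sid 9, 10 and 11 behaves like a small state machine, and it can be replayed in Python against the port 443 trace shown earlier. The sketch counts every packet in each direction from the SYN onwards, which is an assumption chosen so that the numbering lines up with the trace. The function name is made up, and the two booleans stand in for the flowbits.

```python
def tofsee_like(packets):
    """packets: iterable of (from_client, payload_len) for one flow, in order.

    Returns True if the flow shows the 200 / 141 / 475 byte sequence at
    server packet 2, client packet 3 and server packet 4.
    """
    client = server = 0
    server_200 = client_141 = False
    for from_client, size in packets:
        if from_client:
            client += 1
            if client == 3 and size == 141 and server_200:
                client_141 = True       # sid 10: sets Tofsee_CLIENT_141
        else:
            server += 1
            if server == 2 and size == 200:
                server_200 = True       # sid 9: sets Tofsee_SERVER_200
            elif server == 4 and size == 475 and client_141:
                return True             # sid 11: this is where the alert fires
    return False

# Direction and payload length of each packet in the port 443 tcpdump trace.
trace = [(True, 0), (False, 0), (True, 0), (False, 200), (True, 141),
         (False, 0), (False, 475), (False, 0), (True, 0), (True, 0), (False, 0)]
print(tofsee_like(trace))
```

Changing any one of the three sizes, or moving a packet to a different position in the flow, breaks the chain, which is why the combination produces so few false positives.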

So the proof of the pudding, running it against a pcap of the malware:

Num  Rule  Gid  Rev       Ticks      %  Checks  Matches  Max Ticks  Avg Ticks  Avg Match  Avg No Match
---  ----  ---  ---  ----------  -----  ------  -------  ---------  ---------  ---------  ------------
  1     3    1    1      443120  33.03     165      158     102108    2685.58    2731.72       1644.00
  2    11    1    1      310420  23.14     259        2       2860    1198.53    2478.00       1188.58
  3     4    1    1      302944  22.58     269      269      15376    1126.19    1126.19          0.00
  4    10    1    1      257896  19.22     259        3      16484     995.74    7446.67        920.14
  5     9    1    1       27088   2.02      10        3       7448    2708.80    5080.00       1692.57

Events:

[**] [1:11:1] Possible Tofsee server Packet 4(475 Bytes) [**] {TCP} 216.246.8.230:443 -> 192.168.1.10:1031
[**] [1:11:1] Possible Tofsee server Packet 4(475 Bytes) [**] {TCP} 84.16.252.136:22050 -> 192.168.1.10:1032

My Tofsee rules fire on all 5 pcaps I looked at initially (and lots more pcaps I tested after that), so hopefully it will fire on all current Tofsee traffic.

I also replied to an e-mail on the snort-user list on the 3rd of November, making the same feature request as I did for Suricata. No one followed up :/ The e-mail should probably be directed to the snort-devel list some time in the future…

I hope this post has been useful, and hopefully we can get some more flowint rules out there, and maybe even get native packet and byte counting in Snort and Suricata one day :)

by Edward Bjarte Fjellskål at Mon 09 Jan 2012, 07:09

30 December 2011

Edward Bjarte Fjellskål

Security thoughts for 2012+

Quoting Richard Bejtlich: “Prevention will eventually fail!”

And I have always agreed. Accidents do happen, the world is not perfect. So when companies that really spend time and money on security get breached (RSA, Lockheed, Google, [place your company here?]), you can work from the theory that you will eventually get breached too.

When you realize and accept that, you may need to redefine the way you think of IT security. You should prepare for the worst, so identifying what would be “the worst” for you (your company) and then identifying your most critical assets should be at the top of your list, and you should focus your effort on securing them the most.

Limit the users that have access to the most critical assets (and work on sensitive projects etc). The users also need special attention when it comes to awareness training and follow up. They should also have a good communication with the security staff making it easy to report anything that seems suspicious and get positive feedback no matter what. They are an important part of picking up security issues where your technology fails! So you need them.

The most critical assets need to be monitored as close to real-time as it gets. The time from incident detection to your response should be kept to a minimum, even outside working hours and on weekends.

The users who have access to these critical systems should also get special attention/hardening on their OSes etc. Use a modern operating system, enable the security functionality that is already there, and make sure that executables can’t be executed from temporary directories etc. When you have the basic security features in place (including anti-virus), you should start looking at centralized logging and alerting on suspicious activities from the logs.
You should also look into implementing different ways of monitoring anomalies in the users’ usage. When do they normally log on? From where do they normally log on? Are they fetching lots of documents from the file servers? etc. And did they access the fake “secret document” that is there just for catching suspicious activity? (You need to define your own anomalies).

When the inner core (most valued assets + its users) are “secured”, you should strive to maintain an acceptable level of security on the rest of the corporate office network and also importantly the public facing part. Compromises here can be used to escalate into the “inner core” or to damage your reputation and business affairs, so keeping an acceptable level of security here “as always” is good.

As “Prevention will eventually fail!”, you need to have sufficient logging up and running. Then, when you do have an incident, the analyst has sufficient data to work with. This also keeps the cost down, as the time it takes to handle an incident will be lower. I’m mostly into Network Security Monitoring, so for me, NetFlow type data, IDS events, full packet capture, proxy logs and DNS query logs are some key logs from the network that will help me. On the host side of logging, the more logging, the better… web, email, proxy, spam, anti-virus, file-access, local client logs, syslogs/eventlogs, and so on…

And remember – if you can't spot any badness, you are not looking hard enough :)
I always work on the theory that something in my networks is p0wned. That keeps me on my toes and keeps me actively finding new ways to spot badness.

With that – I wish you all a hacky new year!

by Edward Bjarte Fjellskål at Fri 30 Dec 2011, 14:15

29 December 2011

Lars Strand

Terminal tip: Pipe Viewer

A couple of weeks ago, I held an elementary Linux/Unix course. One of the toughest concepts in that course is that of pipes and redirection.

I usually begin explaining pipe as "the output of one command becomes input to the next", and show by an example:

 $ zcat pureftpd.log.gz | cut -f1 -d' ' | sort | uniq | wc -l
 1259073

This command reads a ~550MB large compressed pureftpd logfile (from ftp.uio.no), and finds the number of unique visitors. Several commands are linked together by pipe, so the output of one command is input to the next.

However, I received an interesting question: "Which command takes the longest time?"

There is no easy way to tell; we can only make an educated guess. However, we can use a handy little Unix utility called "Pipe Viewer" to monitor and measure the data going through a pipe. Install it with apt:

  $ sudo apt-get install pv

Next, we rebuild the command above using pv. Since pv behaves like cat with respect to input/output, we can measure the throughput between each command:

  $ zcat pureftpd.log.gz | pv -cN zcat | cut -f1 -d' ' | \
  > pv -cN cut | sort | pv -cN sort | uniq | pv -cN uniq | \
  > wc -l


Judging from the pv output, the command with the slowest throughput was "uniq". Both cut and sort had an impressive 6-7MB/s throughput.
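The idea behind pv can also be sketched in Python: a pass-through stage sits between two other stages and counts what flows past without changing it. This is only an illustration, not how pv itself is implemented, and the function names and the tiny sample log are made up for the example.

```python
import time


def measure(name, lines, stats):
    """Pass lines through unchanged while recording size and elapsed time."""
    start = time.monotonic()
    total = 0
    for line in lines:
        total += len(line)
        yield line
    stats[name] = {"chars": total, "seconds": time.monotonic() - start}


def first_field(lines):
    """Rough equivalent of cut -f1 -d' ': keep the text before the first space."""
    for line in lines:
        yield line.split(" ", 1)[0] + "\n"


def count_unique(lines):
    """Rough equivalent of sort | uniq | wc -l."""
    return len(set(lines))


# Made-up sample data standing in for the FTP log.
log = ["10.0.0.1 GET /a\n", "10.0.0.2 GET /b\n", "10.0.0.1 GET /c\n"]
stats = {}
visitors = count_unique(
    measure("cut", first_field(measure("read", log, stats)), stats)
)
print(visitors, stats)
```

As with pv, the timing at each measuring point includes the time spent waiting for the stages further down the pipe, so it shows throughput at that point rather than the CPU time of a single command.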

by Lars Strand (noreply@blogger.com) at Thu 29 Dec 2011, 13:10

21 December 2011

Magnus Hagander

www.postgresql.org - brand new, yet old and familiar

Most of the visitors to www.postgresql.org probably never noticed that a couple of weeks back, the entire site was replaced with a new one. In fact, we didn't just change the website - just days before, we made large changes to our ftp network as well (more about that in another post, from me or others). So in fact, we hope that most people didn't notice. The changes were mainly a technical refresh, and there hasn't been much change to the contents at all yet. We did sneak in a few content changes as well, that have been requested for a while, so I'm going to start with listing those:

  • The developer version of the documentation (updated several times per day from the tip of the HEAD branch that will eventually become the next version of PostgreSQL) now lives on the main website, and will use the same stylesheets to look a lot nicer than before.
  • Anybody who submits content to our site (news, events, professional services, products, etc) will notice there is now a new concept of an Organisation. This means that it will finally be possible to have more than one person manage the submissions from a single company or group.
  • Again for those that submit content, it is now possible to view which of your submissions are still in the moderation queue, and it's also possible to edit something after it's been submitted. In fact, you can edit your items even after they've been approved. Any such editing will be post-moderated, and if this is abused that organization will be banned from post-moderation - but we don't expect that to ever be necessary.
  • And finally, for those that submit content again, we've switched to markdown to format your submissions, instead of a very random subset of allowed HTML tags.
The rest of the changes are under the hood, and were mostly made for two reasons:
  • The technology powering the site was simply very old
  • The frameworks used were quite obscure, which severely limited the number of people who could (or wanted to) work with them

Hopefully these two changes will make it easier to contribute to the website, so if you're potentially interested in doing that, please read on!


Continue reading "www.postgresql.org - brand new, yet old and familiar"

by Magnus Hagander at Wed 21 Dec 2011, 13:33

18 December 2011

Faggruppe PHP

Fake php time on your ubuntu server

Sometimes it is necessary to fool your PHP application in order to test functionality related to the “current date”.

Fooling php cli is simple using faketime:

Say you have this PHP script:

#!/usr/bin/env php
<?php
print date( 'F j. Y [H:i]' )."\n";

Rendering this script with php through faketime will give you results like this:

$ faketime 'last Friday 5 pm' ./time.php
December 9. 2011 [17:00]

The same can be achieved using the datefudge command:

$ datefudge "2007-04-01 10:23" ./time.php
April 1. 2007 [10:23]

Faking time for a webapplication

Faking time for a web application is not that simple, since Apache forks a process for each request and thus creates new PHP processes.
In this post I will show you how to use the FakeTime Preload Library to fake the time system wide while running tests on a web application.

First I need to install libfaketime as described on the library's homepage.
In short:

wget http://www.code-wizards.com/projects/libfaketime/libfaketime-0.8.1.tar.gz
tar -xvzf libfaketime-0.8.1.tar.gz
cd libfaketime-0.8.1
vim README
make
sudo make install
export LD_PRELOAD=/usr/local/lib/faketime/libfaketime.so.1

For the demonstration I will use a php script like this:

<?php
printf( "The current time of the server is: %s\n", date('l F j. Y [H:i:s]') );

Running this script should yield something like:

$ php time.php
The current time of the server is: Friday December 16. 2011 [08:02:43]

Now I create a file in my home folder, faketimerc, with a new future time. I will use this file in different ways to show how I can manipulate the time.

echo "@2012-12-21 12:12:12" > faketimerc

If you now create an environment variable, FAKETIME, and give it a future time, running the same script yields something like this:

$ php time.php
The current time of the server is: Friday December 16. 2011 [08:19:39]
$ export FAKETIME=$(cat faketimerc)
$ php time.php
The current time of the server is: Friday December 21. 2012 [12:12:12]
$ unset FAKETIME
$ php time.php
The current time of the server is: Friday December 16. 2011 [08:19:56]

As you can see, the processes you run while the environment variable is set to a future time will get a fake time. Once the variable is unset new processes will get normal time.
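The same per-process behaviour can be had without touching the shell's own environment, by handing a modified environment to one child process only. The following Python sketch assumes the library path from the install step above; the helper names are made up for the example.

```python
import os
import subprocess

# Path taken from the install step above; adjust it if the library lives elsewhere.
LIBFAKETIME = "/usr/local/lib/faketime/libfaketime.so.1"


def faketime_env(fake_time, base_env=None):
    """Return a copy of the environment with LD_PRELOAD and FAKETIME set."""
    env = dict(os.environ if base_env is None else base_env)
    env["LD_PRELOAD"] = LIBFAKETIME  # replaces any existing LD_PRELOAD value
    env["FAKETIME"] = fake_time
    return env


def run_with_fake_time(command, fake_time):
    """Run one command under a fake clock; the calling process is unaffected."""
    return subprocess.run(
        command, env=faketime_env(fake_time), capture_output=True, text=True
    )
```

For example, `run_with_fake_time(["php", "time.php"], "@2012-12-21 12:12:12")` would start only that one PHP process with the fake time, provided the library is installed at the assumed path.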

The same can be achieved by creating a file, “.faketimerc”, in your home folder:

$ php time.php
The current time of the server is: Friday December 16. 2011 [08:23:12]
$ cp faketimerc .faketimerc
$ php time.php
The current time of the server is: Friday December 21. 2012 [12:12:12]
$ rm -f .faketimerc
$ php time.php
The current time of the server is: Friday December 16. 2011 [08:23:36]

If you want to change the time for a PHP application that runs through Apache, you may want to set the fake time system wide so that the spawned PHP processes use the faked time. To do this you need to create a file, /etc/faketimerc, with the same content as the one I created in my home folder.

I will use w3m for this demonstration. I assume a normal ubuntu server with apache2 installed.

$ sudo cp time.php /var/www
$ sudo cp faketimerc /etc/
$ sudo /etc/init.d/apache2 restart
 * Restarting web server apache2 ... waiting
$ php /var/www/time.php
The current time of the server is: Friday December 21. 2012 [12:12:12]
$ date
fr. 21. des. 12:12:12 +0100 2012
$ w3m -dump localhost/time.php
The current time of the server is: Friday December 16. 2011 [11:16:24]

As you can see, both the date command and rendering the script with the php CLI give me the fake time, but when rendering the script through Apache I lose the fake time. This is because the environment variable LD_PRELOAD is not set for the Apache process.

To fix this I need to set LD_PRELOAD for Apache by editing /etc/apache2/envvars. Let's test it again:

$ sudo bash -c "echo \"export LD_PRELOAD='/usr/local/lib/faketime/libfaketime.so.1'\" >> /etc/apache2/envvars"
$ sudo /etc/init.d/apache2 restart
 * Restarting web server apache2 ... waiting
$ w3m -dump localhost/time.php
The current time of the server is: Friday December 21. 2012 [12:12:16]

Fake time! :)

PS! If you also want your PostgreSQL server to use fake time:

$ sudo bash -c "echo \"LD_PRELOAD='/usr/local/lib/faketime/libfaketime.so.1'\" >> /etc/postgresql/8.4/main/environment"
$ sudo /etc/init.d/postgresql-8.4 restart

by strind at Sun 18 Dec 2011, 11:47

14 December 2011

Ingvar Hagelund

Unquiet Ubuntu’s grub

When working with servers or debugging a workstation, or even just out of curiosity, it is geek-friendly to make your Linux kernel boot more verbosely rather than less. In Ubuntu 8.04.4 LTS, and probably other Debian derivatives as well, the default is to be rather quiet, and it’s a bit difficult to find out how to make it verbose without hard-coding changes to /boot/grub/menu.lst. Such changes will be overwritten by update-grub, so that’s probably a bad idea.

This fixes it:

# echo 'supports_quiet=false' >> /etc/default/grub
# update-grub

by ingvar at Wed 14 Dec 2011, 07:47

09 December 2011

Ingvar Hagelund

Setting an address on the HP iLO from Linux

So, we put this nice DL360 G7 in production, and found that networking on the iLO (integrated lights-out management) was not configured correctly. Now, the box was already running software, so it was a bit unpopular to reboot it just to get iLO access again. Just for fun (sorry, I did not have high hopes for their Linux support), I called HP support. They stated of course that this was not possible without rebooting the server and accessing the iLO setup through its BIOS.

Now, the HP iLO 3 should support IPMI, so after a bit of fiddling around, I came up with this, and it actually works. The following was executed on RHEL5.

First find the LAN channel

# for i in `seq 1 14`; do ipmitool lan print $i 2>/dev/null | grep -q ^Set && echo Channel $i; done

Channel 2

So, on this system, channel 2 is the LAN channel.

# ipmitool lan print 2

Set in Progress         : Set Complete
Auth Type Support       :
IP Address Source       : DHCP Address
IP Address              : 0.0.0.0
Subnet Mask             : 0.0.0.0
MAC Address             : c0:ff:ee:c0:ff:ee
BMC ARP Control         : ARP Responses Enabled, Gratuitous ARP Disabled
Default Gateway IP      : 0.0.0.0
802.1q VLAN ID          : Disabled
802.1q VLAN Priority    : 0
Cipher Suite Priv Max   : Not Available

Okay, so if you have a DHCP server on your management network, you may be content with this, and just give it an address by DHCP. I wanted to set a static address, though.

# ipmitool lan set 2 ipsrc static
# ipmitool lan set 2 ipaddr 192.168.42.36
# ipmitool lan set 2 netmask 255.255.255.0
# ipmitool lan set 2 defgw ipaddr 192.168.42.1
# ipmitool mc reset cold

That’s it actually. Exchange the LAN channel and network addresses with your own, of course.
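To check the result from a script, the output of "ipmitool lan print" can be turned into a dictionary. This is a small Python sketch that assumes the "Key : Value" layout shown above; the function name is made up, and lines without a key (continuation lines) are simply skipped.

```python
def parse_lan_print(text):
    """Turn 'Key : Value' lines from `ipmitool lan print` into a dict."""
    settings = {}
    for line in text.splitlines():
        # Split on the first colon only, so MAC addresses stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            settings[key.strip()] = value.strip()
    return settings


# Sample taken from the output shown earlier in this post.
sample = """\
Set in Progress         : Set Complete
Auth Type Support       :
IP Address Source       : DHCP Address
IP Address              : 0.0.0.0
MAC Address             : c0:ff:ee:c0:ff:ee
"""
settings = parse_lan_print(sample)
print(settings["IP Address Source"], settings["IP Address"])
```

After the change you would compare the "IP Address", "Subnet Mask" and "Default Gateway IP" entries with the values you set.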

by ingvar at Fri 09 Dec 2011, 16:27

08 December 2011

Edward Bjarte Fjellskål

PassiveDNS update (v0.2.4)

It has been a while since I had time to code on my C projects. But last week I got some time and used it to get PassiveDNS into a state where I'm more relaxed about it. The previous version (v0.1.1) used to spit out all the DNS data it saw. The latest version caches DNS data internally in memory and only prints a DNS record when it sees it for the first time; if it is an active domain, it prints it again after 24 hours, and so on (once a day). The previous version would give me gigabytes of DNS data daily in my test setup, while this version gives me about 2 megabytes. This version also gives you only A, AAAA, PTR and CNAME records at the moment. I’m open to suggestions for more (use cases would be great too!).
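The caching rule described above can be sketched in a few lines of Python. This is not the C code in PassiveDNS, only an illustration of the rule: print a record the first time it is seen, and again once 24 hours have passed. The class and parameter names are made up.

```python
DAY = 24 * 60 * 60  # re-print interval in seconds, per the description above


class SeenCache:
    """Remember when each DNS record was last printed."""

    def __init__(self, interval=DAY):
        self.interval = interval
        self.last_printed = {}

    def should_print(self, query, rtype, answer, now):
        """Return True for a new record, or one last printed an interval ago."""
        key = (query.lower(), rtype, answer)
        last = self.last_printed.get(key)
        if last is not None and now - last < self.interval:
            return False  # seen recently, stay quiet
        self.last_printed[key] = now
        return True
```

Note that this sketch never expires anything, so the dictionary grows with the number of distinct records seen.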

In my tests, and in feedback from people who have tried it, PassiveDNS is very resource friendly when it comes to CPU usage (more or less idling). The current version (v0.2.4) does not implement any limit on memory usage, so if your network sees a lot of DNS traffic, you might end up using some hundreds of megabytes of RAM for the internal cache. The most I’ve seen is around 100 MB at the moment. My plan is to implement some sort of “soft limit” on memory usage, so that you can specify the maximum amount of memory PassiveDNS should use. The “downside” of this, though, is that PassiveDNS would have to expire domains from its cache faster. That might result in bigger log files with duplicate entries. When I say “downside”, it's not a real downside as I see it. From my tests with the example scripts pdns2db.pl and search-pdns.pl, it is not much of a problem keeping up with insertions to the DB (MySQL), and your last-seen timestamp will be a bit more accurate. I guess this kind of data, though, is better suited for a NoSQL solution if you are collecting lots of it.

If you have read this, and you are into Network Security Monitoring, and you don’t use passive DNS in your work, I recommend that you Google it and read a bit about it.

by Edward Bjarte Fjellskål at Thu 08 Dec 2011, 20:27

05 December 2011

Kacper Wysocki

CPM 0.26 the Console Password Manager

Some of you might have noticed that I’ve adopted this little program while its original author is MIA, and that my efforts have resulted in its inclusion into debian wheezy earlier this year.

This is great news and makes it a breeze to get up and running with CPM with a simple apt-get install cpm.

However, it seems that most people are interested in running CPM on older distributions. Realistically, the Debian stable distribution codenamed squeeze is a favorite, as well as the Ubuntu LTS release 10.04, codenamed Lucid Lynx.

So I have built some updated packages of CPM for these oldies but goodies:
* CPM for squeeze i386
* CPM for squeeze amd64
* CPM for lucid i386
* CPM for lucid amd64

Remember to install the dependencies though. On squeeze, they are:

me@mine:~# apt-get install \
    libcdk5 libcrack2 libdotconf1.0 libgpg-error0 \
    libgpgme11 libxml2 libxml2-utils libpth20

File us a ticket if you run into trouble with these packages or need cpm working on some other distribution.

CPM is a simple, paranoid password manager for the console with some cool features that make it stand out:

* data files can be encrypted for more than one person
* data files are signed by the last person who saved them, so forging data files is not possible
* data files are en- and decryptable directly by gpg and gzip
* the application memory is protected from paging, core dumps, ptrace attacks and runtime environment
* data is validated using an internal DTD
* several passwords per account are possible to store
* it’s possible to handle several data files, each encrypted for different people
* cracklib checks of password strength and warnings about weak passwords
* user definable hierarchy with unlimited depth
* long comments for any node in the hierarchy
* password generator
* only one password visible at a time
* searchable database from the command line
* user definable search patterns (e.g. user@hostname)
* several hits can be displayed at once (e.g. several accounts per host)
* conversion scripts for Password Management System (pms), Password Safe and CSV files

by kacper at Mon 05 Dec 2011, 16:12

03 December 2011

Ingvar Hagelund

Gingerbread AT-AT Walker

Two years ago, we built the Gingerbread Millennium Falcon. Last year, we built the Gingerbread Vader’s Tie Fighter. What should we build this year? More Star Wars vehicles? After a bit of heavy thinking, we came up with the Imperial AT-AT Walker. Luckily, the web is full of pictures and sketches, so we found plenty of inspiration.

After four hours, the result exceeded all expectations! Happy advent everybody!

Update: The Friendly Fredrik posted some more pictures of the building process.

Parts ready for the oven

AT-AT Walker #7

That's a lot of parts

Assembly of the AT-AT Walker

Assembly by melted sugar. Hot stuff!

AT-AT Walker #1

AT-AT Walker #2

AT-AT Walker #3

AT-AT Walker #4

AT-AT Walker #5

AT-AT Walker #6

by ingvar at Sat 03 Dec 2011, 19:21

02 December 2011

Ingvar Hagelund

@ingvarha – that’s me

So, I finally thought I should try this social media thingie, and got myself a Twitter account. Who would have thought, after all these years…

So if you wonder, @ingvarha – that’s me.

by ingvar at Fri 02 Dec 2011, 09:38

25 November 2011

Lars Strand

Security architectures in telephony systems

As tradition dictates, before I could defend my Ph.D. dissertation on 22nd November, I had to give a 45-minute trial lecture. I was given only the title, and had 14 days to prepare. My title was:

  "The development of security architecture in fixed and mobile telephone systems"

One of the toughest tasks was to interpret the title and limit the scope of the lecture. I discussed it with my supervisors and co-researchers and received several tips and relevant references. Then followed two intense weeks of study and preparation.

I was satisfied with the outline and the result, and felt comfortable presenting the lecture.

For those interested, the slides can be downloaded here

by Lars Strand (noreply@blogger.com) at Fri 25 Nov 2011, 01:51

14 November 2011

Simeon Simeonov

Working with database schemas in SQLAlchemy

Database schema objects are often neglected when designing systems and applications. One of the main reasons may be the developers' poor understanding of the advantages that come along when using schemas.

A database contains one or more named schemas, which in turn contain tables.

Unlike databases, schemas are not rigidly separated: a user can access objects in any of the schemas in the database he is connected to, if he has privileges to do so.

Imagine that you are building a system that consists of two logical parts: a book archive and a back office authentication.
If both of them require a table named 'users', there is a potential problem with name collisions.

Instead of using "smart" / namespaced table names "ba_users, bo_users", a more elegant solution is to isolate the two logical parts of the system - each in its own schema within the same database.

Schemas are analogous to directories at the operating system level, except that schemas cannot be nested.
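The name-collision point can be tried out with nothing but Python's standard library. The sketch below is a stand-in and does not use SQLAlchemy: SQLite has no schemas as such, so two attached in-memory databases play the role of the two schemas, and the names book_archive and back_office are made up for the example. In SQLAlchemy itself, the schema is normally named through the schema argument of Table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach two extra in-memory databases; their names act as schema qualifiers.
conn.execute("ATTACH DATABASE ':memory:' AS book_archive")
conn.execute("ATTACH DATABASE ':memory:' AS back_office")

# The same table name can now exist once per namespace without colliding.
conn.execute("CREATE TABLE book_archive.users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE back_office.users (id INTEGER PRIMARY KEY, name TEXT)")

conn.execute("INSERT INTO book_archive.users (name) VALUES ('reader')")
conn.execute("INSERT INTO back_office.users (name) VALUES ('admin')")

archive_users = conn.execute("SELECT name FROM book_archive.users").fetchall()
office_users = conn.execute("SELECT name FROM back_office.users").fetchall()
print(archive_users, office_users)
```

Both tables are called users, yet each query only sees the rows of the namespace it names.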

The obvious advantages are:
... more

by Simeon Simeonov at Mon 14 Nov 2011, 08:18

13 November 2011

Sigvard Lyth

Christmas wishes

An Amazon Kindle with one of these, or a
Nook, or a Cybook

These serve the same purpose.

On them we would really like lots of children's books and fairy tales, so that we can read together (Mathilda and I).

An Asus Transformer 2 pad, or possibly a Kindle Fire

For Christmas we would also like books:

We have decided to collect Roald Dahl and Astrid Lindgren books. All of them.
Here is a collection

These are in English, though, but Dad can't find much in Norwegian. You could take a look at http://www.digitalbok.no/

Pretty sad, this whole business with the Norwegian publishing industry and digital books... But we also accept physical books ;)

There is always room for more Duplo bricks

Book for Dad: My Friend the Mercenary, by James Brabazon








by Sigvard (noreply@blogger.com) at Sun 13 Nov 2011, 07:24

06 November 2011

Magnus Hagander

PGConf.EU 2011 - the speakers and the presentations

This part of the feedback is almost turning into a repost from year to year. But if anything is a good thing to repost, this is, so I'm doing it anyway. To start with, just take a look at these graphs:

i-Bs8mPP7.png i-LjQPx85.png

Those are pretty fantastic ratings. A full 84% rated the content quality as 4 or 5, and only 1% rated it as less than 3. That basically comes down to there being no talks of bad quality. This confirms the feeling that we had when we tried to pick out the talks for this year - the number of great submissions was just huge. We had to reject around half the talks submitted, and there were only a few of those that we rejected because we thought they weren't very good. Most were simply rejected because we didn't have the time and space to accept them all.

The ratings people have given our speakers confirm what we have always thought to be one of the reasons people like the conference - and many other PostgreSQL conferences as well: you get to listen to and talk to the people who really know what they are talking about. Often because they are the very people who wrote the software in question. A whole 96% of all the ratings gave our speakers a score of 4 or 5 for their knowledge of the topic. And nobody scored lower than 3. These truly are the experts you get to meet!

Most of our speakers also scored very high on the Speaker Quality metric. Our top speakers this year were:

Speaker                  Rating  Vote count  Standard deviation
Bruce Momjian            4.8     31          0.4
Ram Mohan                4.7     36          0.5
Selena Deckelmann        4.7     38          0.5
Magnus Hagander          4.6     52          0.6
Simon Riggs              4.6     43          0.6
Stephen Frost            4.6     18          0.5
Peter van Hardenberg     4.5     11          0.7
Gavin M. Roy             4.5     10          0.5
Greg Smith               4.5     68          0.7
Harald Armin Massa       4.4     10          0.5
Steve Singer             4.4     10          0.7
Gianni Ciolli            4.4     32          0.8
Dave Page                4.3     25          0.8
Heikki Linnakangas       4.3     12          0.9
Ed Boyajian              4.2     13          1.0
Marc Balmer              4.1     12          0.7
Dimitri Fontaine         4.0     11          0.8

This really is the reason why people come to the conference, and keep coming back the next year - our outstanding speakers! Thank you all for showing up this year to give your presentations, and we hope to see you again next year!

That concludes the posts I'm going to make about pgconf.eu feedback this year. Some of you have already asked about next year, and I'm not going to post any information about the feedback we got there - yet. We are reviewing the feedback we received, and are soon going to start looking for a good venue for next year. We have made the mistake before of announcing a location before we had a venue secured, and we're not going to do that again. We are going to announce it as soon as we know, but that will not be until we have actually decided on an exact venue. But we are absolutely planning to do it again next year, and sometime around the same time of the year. Exactly where we don't know yet...

by Magnus Hagander at Sun 06 Nov 2011, 16:36

04 November 2011

Magnus Hagander

PGConf.EU 2011 - the feedback is in

Almost exactly a week later than what we said, I have finally closed down the feedback system for PostgreSQL Conference Europe 2011. I think we all needed slightly more time than we expected to recover and catch up properly...

The detailed feedback for each speaker will be sent out during the day today, unless we run into any unforeseen technical issues, and I will try to summarize the conference-wide feedback here. If any particular note that you posted is not referred to here, don't worry - we read them all, but there are far too many of them to post here.

Starting with the conference organization itself and its venue, I'm really happy to see that we have managed to deliver something that the majority of our attendees really like:

i-kpsz6c3.png i-N5rCKq7.png

Not a single vote less than 4, on a scale of 1-5, for the overall impression. And only one below 4 for the programme. I can only say a huge thanks to the big group of volunteers who ran this conference, and made it what it was. Clearly you did a good job!


Continue reading "PGConf.EU 2011 - the feedback is in"

by Magnus Hagander at Fri 04 Nov 2011, 09:53

01 November 2011

Faggruppe PHP

Practising the fundamentals: Function of the week/month

When you need to see if any of the chars in $chars occur in another string, what's the simplest way to search for them?
Hopefully none of the libraries have such functionality, so you need to go “a bit lower”.
The answer is, of course, trivial:

$chars = "xy";
$string = "the quick brown fox jumps over the lazy dog";

$found = false;
for($l = 0; $l < strlen($chars); $l++) {
        if (strpos($string, $chars[$l]) !== false) {
                $found = true;
                break;
        }
}
if ($found) {
        /* Do stuff */
}

But hang on. This feels a bit weird. Surely PHP must have a better way of doing this?
Browsing through the string section in the PHP manual you’ll notice PHP has bucketloads of native string functions. If you have a background from other languages, you could even just try and see if PHP has a function of the same name (which quite often it does) that solves the problem.

And sure enough, it does: strpbrk!

$chars = "xy";
$string = "the quick brown fox jumps over the lazy dog";
// strpbrk() returns the rest of $string from the first match, or false
if (strpbrk($string, $chars) !== false) {
        /* do stuff */
}

Just give the “problem” a second thought before going crazy with your coding, and keep in mind you aren’t just working with Symfony, Drupal, SugarCRM, WordPress, … Your good old pal, PHP, is there too.

by bjori at Tue 01 Nov 2011, 19:35

22 October 2011

Faggruppe PHP

Responsive Web Design

Which resolution a web application should be optimized for is a question whose answer depends mainly on when it is asked. The average screen size and resolution have been growing as a result of technological evolution and falling prices of electronic equipment.
Resolution usage, Source: http://www.w3counter.com/

At the same time, mobile devices such as pads and cell phones, with small screens and low resolution, have become capable of browsing the Internet effectively.
Mobile browsing, Source: http://gs.statcounter.com

So, what do I do if my requirements include support for both an iPhone 3GS (3.5", 320×480) and a PC screen with a high resolution of 1920×1080? Somehow I must optimize my application for these two resolutions.

One of the common solutions is to split the application at the view level by implementing two versions of the templates: one optimized for high-resolution and the other for low-resolution devices. It is going to work, but is that really a good solution from a design point of view? The main disadvantage is that the application still does not support all possible resolutions; it is just optimized for a high and a low one. And of course another issue is maintainability and future compatibility.

What if, instead of optimizing the application for a specific resolution, I could support all of them? It is actually not a new idea. It has been ages since I could first use the size=”100%” attribute in some HTML tags. By doing this, the HTML element is always scaled to the maximum available size. It gives some flexibility. However, scaling has limitations: sooner or later I will reach a stress point where scaling is not effective anymore and I need to change the layout by relocating some elements of the interface.

So, to optimize web application to any resolution I would need to:

  • scale html objects
  • dynamically change layout
  • scale images

Generally speaking, I need an ability to alter my application, in order to continually reflect the environmental conditions.

This is what “Responsive Web Design” stands for.

In practice responsive web design is an intelligent use of flexible grids, layouts, images and CSS media queries.

Media Queries

With media queries I’m able to solve the first two problems: I can scale HTML elements and change the layout.

Since CSS 2.1 it has been possible to define custom style sheets for different media types.

@media print {
/* style sheet for print goes here */
}

@media screen {
/* style sheet for screen goes here */
}

CSS 3 offers an extension called media queries, which lets you specify the conditions under which a style sheet will affect the user interface.

@media screen and (max-width: 640px) {
/* Window size < 640px */
}

@media screen and (max-width: 800px) and (min-width: 640px) {
/* Window size between 640px and 800px */
}

@media screen and (max-width: 1024px) and (min-width: 800px), (max-width: 640px){
/* Window size between 1024px and 800px or less than 640px */
}
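To see which of the three rules above apply at a given window width, the conditions can be modelled in a few lines of Python. This is only a model of the width conditions, with made-up rule names; it ignores the media type (screen) and is not how a browser evaluates CSS internally. Note that min-width and max-width are inclusive bounds.

```python
def matches(width, min_width=None, max_width=None):
    """True if width satisfies the given min-width/max-width conditions.

    Both bounds are inclusive; conditions joined by "and" must all hold.
    """
    if min_width is not None and width < min_width:
        return False
    if max_width is not None and width > max_width:
        return False
    return True


def active_rules(width):
    """Names of the three example rules that apply at this window width."""
    rules = {
        "small": matches(width, max_width=640),
        "medium": matches(width, min_width=640, max_width=800),
        # A comma in a media query list means "or".
        "large-or-small": matches(width, min_width=800, max_width=1024)
        or matches(width, max_width=640),
    }
    return [name for name, active in rules.items() if active]
```

Because the bounds are inclusive, a window that is exactly 640px wide matches all three rules, so the order of the rules in the style sheet decides which declarations win.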

Here are examples of valid media query definitions:

<link rel="stylesheet" type="text/css" media="screen and (max-width: 640px)" href="shetland.css" />

@media screen and (max-width: 640px) {
/* Window size < 640px */
}

@import url("style.css") screen and (max-width: 640px);

A web browser that does not support media queries may load the CSS unconditionally, which is unwanted behavior. To prevent browsers without media query support from applying such CSS, add the word “only” before the media type:

/* add the word «only» so that web browsers without media query support ignore the rule */
@media only screen and (max-width: 640px) {
/* Window size < 640px */
}

There are quite a few criteria available in CSS3; the first two in the list below are the most useful.

  • max-width / min-width
  • max-device-width / min-device-width
  • orientation (portrait/landscape)
  • device-aspect-ratio
  • min-resolution / max-resolution
  • monochrome
  • min-color-index

See http://www.w3.org/TR/css3-mediaqueries for more

CSS Media queries are supported by the following web browsers:

  • Firefox 3.5+
  • Chrome
  • Safari
  • Opera 9.5+
  • Opera Mini
  • Android Browser
  • Opera Mobile
  • IE9+

CSS media queries are quite useful if the goal is to target mobile devices, since they are supported by most of the web browsers used on mobile devices.
But if the goal is to support older web browsers, JavaScript comes to the rescue.

Here is an example of loadCss() and removeCss() methods that can be used to dynamically load and remove css files.


/**
 * Load CSS file
 */
function loadCss(filename) {
    var links = document.getElementsByTagName("link");
    // Do nothing if the style sheet is already loaded
    for (var i = 0; i < links.length; i++) {
        if (links[i].getAttribute("href") == filename) return;
    }
    var fr = document.createElement("link");
    fr.setAttribute("rel", "stylesheet");
    fr.setAttribute("type", "text/css");
    fr.setAttribute("id", filename);
    fr.setAttribute("href", filename);
    document.getElementsByTagName("head")[0].appendChild(fr);
}

/**
 * Remove CSS file
 */
function removeCss(filename) {
    var links = document.getElementsByTagName("link");
    // Walk backwards: the collection is live and shrinks as links are removed
    for (var i = links.length - 1; i >= 0; i--) {
        if (links[i].getAttribute("href") == filename) {
            links[i].parentNode.removeChild(links[i]);
        }
    }
}

 

Adding the code below to the onResize event handler is roughly equivalent to: @import url("mini.css") screen and (max-width: 400px);

/* Activate mini.css file if window size is less than 400px */
var windowWidth = window.innerWidth || document.documentElement.clientWidth;
if (windowWidth < 400) {
    loadCss('mini.css');
} else {
    removeCss('mini.css');
}

The above example should in theory be compatible with old web browsers (including IE6).

Images

The last of my needs is to scale images. Here are a few ways to do it.

Fluid images

When a user opens my application on a mobile device with a small screen and low resolution, or changes the web browser window size, my images may reach a stress point where they consume more space than the window is able to offer. To avoid this problem I need to scale them proportionally.

Fortunately there is a CSS property called max-width; if I set it to 100%, in theory I get all I need. My images will by default be displayed in their original size and will be scaled proportionally when needed.

But in practice this solution has a few problems. The main one is that oversized images still have to be loaded at low resolutions, which can be a problem for small mobile devices. The other problem is the poor quality of scaled images in IE.

Hiding images

Another possibility is to have alternative versions of an image, depending on the environment images can be visible or hidden, just like on the example below:

.small-image {
display:none;
}

@media only screen and (max-width: 600px) {
.default-image { display:none; }
.small-image { display:inline; }
}

<img class="default-image" src="img1.jpg">
<img class="small-image" src="img2.jpg">

It works quite well: if the window size is 600px or less (this is the stress point), small-image gets display:inline; and default-image gets display:none;
It is a nice solution, compatible with all common web browsers. However, IE always loads both images even if one of them will never be shown, and the code is not quite clean.

“Content” property

It is possible to show images by combining the content property with a url() value, as in the example below:
@media screen and (max-width: 600px) {
.image1:before {
content:url(img2.jpg);
}
}
@media screen and (min-width: 600px){
.image1:before {
content:url(img1.jpg);
}
}
<span class="image1"></span>


But still, this code isn’t clean and is not compatible with some common web browsers.

JavaScript way

JS can help here as well.

One possibility is to set a cookie with the screen width:
document.cookie = "screenWidth=" + screen.width;
and then serve all images through a media script, which sends the right image version depending on the screenWidth value:
<img src="media/?test.jpg">
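
On the server side, something has to read that cookie and pick a file. Below is a minimal sketch of that decision, written in JavaScript for consistency with the rest of this post. The function names, the 400px threshold and the "-400px" file suffix are assumptions made for illustration, not part of any existing media script.

```javascript
// Hypothetical: extract screenWidth from a raw Cookie header string.
function readScreenWidth(cookieHeader) {
  var match = /(?:^|;\s*)screenWidth=(\d+)/.exec(cookieHeader || '');
  return match ? parseInt(match[1], 10) : null;
}

// Hypothetical: map a requested file name to the variant to send.
function pickImageVariant(fileName, cookieHeader) {
  var width = readScreenWidth(cookieHeader);
  if (width === null || width > 400) {
    return fileName; // unknown or large screen: send the original
  }
  var dot = fileName.lastIndexOf('.');
  if (dot === -1) {
    return fileName + '-400px';
  }
  // Insert the suffix before the extension, e.g. test.jpg -> test-400px.jpg
  return fileName.slice(0, dot) + '-400px' + fileName.slice(dot);
}
```

With a cookie of screenWidth=320, a request for test.jpg would be answered with test-400px.jpg. Without the cookie the original file is sent, so the first page view always gets the full-size image.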

CSS 3 way

I expect the code below to be supported in all CSS3-compatible web browsers in the future. For now, only Opera partially supports this syntax.

<img src="test.jpg"
img-400px="test-400px.jpg"
>
@media (min-device-width:400px) {
img[img-400px] {
content: url(attr(img-400px));
}
}

by wojak at Sat 22 Oct 2011, 18:47

20 October 2011

Faggruppe PHP

Practising the fundamentals: Staying in touch

With the recent flood of frameworks, libraries and toolkits on the market, it is easy to forget that underneath it all is good old, plain and simple PHP, with all its kinks, quirks, and huuge set of built-in functionality.

PHP has a vast number of extensions which solve all sorts of problems. And if PHP doesn’t have it built in, there is an impressive number of additional extensions on pecl and, more recently, on github.
There is a high chance that someone else has been in your shoes already, so it is worth looking over the horizon to see whether the problem has already been solved.

For some reason the current practice seems to be the “RoR” idiocy, where “RoR developers” barely even know that there is this Ruby thing some miles down the stack. PHP has already hit this “stepping stone” with WordPress, Drupal and even Symfony, and that is a weird and scary thought. Remembering “where you came from” is important, even for those who specialize in specific products. Looking at how other projects work and comparing notes, work ethics, features and functionality is also very important. Getting a different perspective and new knowledge is how we improve our solutions and work more efficiently. If your specific product doesn’t have native support for something, why not look at a different framework/library/cms/toolkit/.. or even at PHP extensions?

As June mentioned earlier, going ‘back home’ and checking out the PHP manual pages is generally a good idea. Things change: manual pages are updated, improved and added, and you come back with a different perspective and other problems to solve. Even if you believe you know all the basics, you still need to practise them, and that includes browsing the manual from time to time, again and again – no matter which project it is.

So what is the best way to stay in touch? To keep up to date with new ways and offerings, and with new solutions to the same problem? Get involved!

By far the best way is to get involved with the project you are using, even if that just means silently idling on the mailing lists and reading the subjects. Subscribing to the commit lists is a fantastic way to see precisely what is going on and which direction the project is taking. Who knows, after a while you may spot something the others didn’t, get an idea for a killer feature, or shed light on a perspective the others didn’t think of. After a while hanging on the lists you’ll get a feeling for how the project works, and hopefully start chiming in. Give your 2 cents, and who knows – maybe even cook up a patch or two.

by bjori at Thu 20 Oct 2011, 15:37

17 October 2011

Erik Inge Bolsø

Terra firma

Ok.

Today I updated the firmware - on a mouse.

When did they get clever enough to need firmware? Probably along with the LED/optical variants.

And what stops them from growing a virtual keyboard overnight and brute-forcing my password? Nothing that I know of.

(Ok, feedback is a problem. But there are USB graphics cards... :p)

by Erik Inge Bolsø (noreply@blogger.com) at Mon 17 Oct 2011, 19:14

07 October 2011

Faggruppe PHP

Optimising a PHP (symfony 1.x/Doctrine 1.x) application

Introduction

Whilst everyone is buzzing and creating fancy new Symfony2/Doctrine 2 applications, and perhaps even a shift to new frameworks/no frameworks, a great deal of us are still maintaining legacy apps and will be for some time to come. As these apps grow, we occasionally need to look back and scream at our old code and wonder why we didn’t make it more scalable or use neat optimisation tricks back when it was first conceived. The fact is, many of these “tricks” are not necessary at the time for a virgin app, and we need to develop code that is relevant to the task at hand.

That said, being aware of some of the case studies I will present to you now may help you to optimise old code, but may also allow you to think twice when you are working with new code – as the things I will describe do not take too much time to implement first time round.

Case study: Working with large resultsets and array_merge

In my sample application, the database contained many different types of organisation stored in different tables, and due to the structure of the data and the criteria for each organisation, there was no easy way to retrieve several organisations at once using pure SQL and joins (well, no convenient way). The solution is relatively simple, one query per organisation then combine the results, in this case as we go along:

$membershipClasses = array("Entity1", "Entity2", "Entity3", "Entity4", "Entity5");
$results = array();
foreach ($membershipClasses as $joinedTable)
{
  $query = $this->createQuery()->from("TablePrefix".$joinedTable." t");
  // Lots of differing criteria based on which class we were dealing with
  ...

  $results = array_merge($query->execute(array(), $hydration), $results);
}

This logic was used to generate a report with approximately 50,000 rows, used 1.2GB of internal memory and took around 20 minutes to complete – often failing or bringing the system to a halt.

One optimisation used in this case (many more are surely possible, but let’s focus on one) involves the last line, which uses PHP’s array_merge function. Simply replacing it with a plain array traversal brought the execution time down to under 2 minutes, although memory usage did not change.


$theseResults = $query->execute(array(), $hydration);
foreach ($theseResults as $aResult)
{
  $results[] = $aResult;
}

Why is this so much faster? And why can’t we use += instead?

The “problem” with array_merge is that even though it discards numeric keys, it still has to *check* them all first, which is a lot of overhead on 50,000 rows. We can’t use += in this case either, because it respects the numeric keys, which start from 0 each time, so the results would not “stack” as intended unless we tell Doctrine to index the results by some unique key.

Why it takes *so* much longer is a bit of a mystery to me. I’ve looked at the C code behind it (ext/standard/array.c) and it’s, well, quite frankly voodoo.

Case study: Limiting results

Something that was “fixed” (read: removed) in Doctrine 2 was the ability to limit results returned from queries, or at least the “magic” part of it. The problem with limiting hydrated results is that you need to know exactly how many rows each sub-tree will contain before you can limit the query as a whole. Imagine a person with many addresses, and you want an array limited to 5 people. You can’t just add “LIMIT 5” to the end of the query, because once this is hydrated you will most likely end up with one or two people, the second of whom may not have all their addresses: you’ve told your database manager to return 5 *rows*, and it has no idea how these rows relate to your model.
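
The effect is easy to reproduce without a database. The sketch below uses JavaScript and plain in-memory arrays as a stand-in for the joined result set. The data, the hydratePeople function and the variable names are all made up for illustration, and the "hydration" is a toy grouping rather than Doctrine's actual hydrator.

```javascript
// Hypothetical flat result of "person JOIN address": one row per address.
var joinedRows = [
  { personId: 1, name: 'Ann',  city: 'Oslo' },
  { personId: 1, name: 'Ann',  city: 'Bergen' },
  { personId: 1, name: 'Ann',  city: 'Tromsø' },
  { personId: 2, name: 'Bob',  city: 'Stockholm' },
  { personId: 2, name: 'Bob',  city: 'Malmö' },
  { personId: 2, name: 'Bob',  city: 'Uppsala' },
  { personId: 3, name: 'Cleo', city: 'Helsinki' }
];

// Toy "hydration": group the flat rows into one object per person.
function hydratePeople(rows) {
  var byId = {};
  var people = [];
  for (var i = 0; i < rows.length; i++) {
    var row = rows[i];
    if (!byId[row.personId]) {
      byId[row.personId] = { id: row.personId, name: row.name, addresses: [] };
      people.push(byId[row.personId]);
    }
    byId[row.personId].addresses.push(row.city);
  }
  return people;
}

// The equivalent of "LIMIT 5": it cuts rows, not people.
var limitedPeople = hydratePeople(joinedRows.slice(0, 5));
```

Here limitedPeople holds only two people, not five, and the second one has lost one of their three addresses.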

In Doctrine 1, adding a limit clause would cause all sorts of magic to happen, the resulting query ending up as a collection of complex subqueries each with their own limit clause, the more joins you introduced, and the more levels of join, the more complicated it becomes. First level joining is not so bad, but joining several levels deep soon starts to get heavy, and your execution speed will suffer for it. Couple this with a large resultset and you will see the smoke drifting from the server room in no time.

So, what’s the solution? Well, this one is not so clear cut – you have to experiment. Sometimes, doing it the “magic” way will work just fine, and is totally acceptable, but when the query becomes “heavy” you have a few options:

Work out your subset first

This is the default Doctrine 2 approach, and it’s quite simple: you use one query to get the IDs of the subset of top-level elements, then pass these IDs to another query in a “WHERE IN” clause. The resulting SQL can look quite scary, especially if you are dealing with many results – you can easily end up saying “WHERE IN (1,2,3,4…..9999999999)”. Most DBMSs will handle this surprisingly well as long as you are using the primary keys, so experiment with it and it might be the way to go.


if ($limit)
{
  $subQuery = $this->createQuery("foo")
                   ->select("foo.id")
                   // ... (further criteria go here)
                   ->limit(50);
  $ids      = $subQuery->execute(array(), Doctrine_Core::HYDRATE_SINGLE_SCALAR);
  $query->andWhereIn("cf.id", $ids);
}

Note the use of “single scalar hydration”. We have no need for more than this (to be really picky, we could go straight to PDO here, but for consistency this is ok). In this case, the single scalar hydration will give us a resulting array of raw IDs, which is exactly what we need to pass to the main query.
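
The two-step idea itself is independent of Doctrine. Here is a small sketch of it in JavaScript over in-memory arrays. The tables, selectIds and rowsForIds are hypothetical stand-ins for the ID query and the main query, not Doctrine calls.

```javascript
// Hypothetical top-level table and its joined child rows.
var peopleTable = [{ id: 1 }, { id: 2 }, { id: 3 }];
var addressJoin = [
  { personId: 1, city: 'Oslo' },
  { personId: 1, city: 'Bergen' },
  { personId: 2, city: 'Stockholm' },
  { personId: 2, city: 'Malmö' },
  { personId: 3, city: 'Helsinki' }
];

// Step 1: the "ID query". The limit applies to top-level elements only.
function selectIds(table, limit) {
  return table.slice(0, limit).map(function (person) {
    return person.id;
  });
}

// Step 2: the main query, restricted with the equivalent of "WHERE id IN (...)".
function rowsForIds(rows, ids) {
  return rows.filter(function (row) {
    return ids.indexOf(row.personId) !== -1;
  });
}

var subsetIds = selectIds(peopleTable, 2);
var subsetRows = rowsForIds(addressJoin, subsetIds);
```

Because the limit is applied before the join, both selected people keep all of their addresses, however many rows that turns out to be.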

Make your own subquery

A variation on the above is to embed the “ID sucking” query into your main query, some database engines may prefer this style, but I have not noticed any significant performance gain or loss doing this so prefer the above option as it’s cleaner in the PHP code.


$query->whereIn('SELECT id FROM my_other_table WHERE blah LIMIT 50');

In a perfect world, this is the kind of thing Doctrine could have done instead of the subquery magic, but there are so many factors to consider here that I can understand why they did not go down that route.

Stick with the magic, but simplify

It is also possible to continue using the magic, but greatly improve performance by simplifying. Remove all those second+ level joins and use separate queries to get hold of the data you need. How often have you added a chain of joins just to get one snippet of data from the last node? Scrap all the joins and make a new query later where you can pass in all the IDs from your main query and get the data you need. Now you don’t need a limit clause, because you are asking for exactly the data you need in the first place.

Start with the fastest hydration mode and work upwards

This one I can’t stress enough – an ORM such as Doctrine is there to give you the tools you need when you need them, but it is often the case that applications are built with all the overhead and magic just because it can be, or because it’s the default behaviour. Stop. Look at what data you *actually* need, and especially ask yourself if you need objects in your results. Perhaps you are retrieving all your users and listing them with their profile data, but you are hydrating them as objects because you need some custom functions you’ve written, classic examples are getAge() or getFullName() which are derived from other fields. If this is the only reason you are object hydrating, consider something like this:


class User
{
  function getAge()
  {
    return self::getAgeFromDOB($this["dob"]);
  }

  function getFullName()
  {
    return self::getFullNameFromUserArray($this);
  }

  public static function getAgeFromDOB($dob)
  {
    $dob = strtotime($dob);

    $year_diff  = date("Y") - date("Y", $dob);
    $month_diff = date("m") - date("m", $dob);
    $day_diff   = date("d") - date("d", $dob);

    if ($month_diff < 0 || ($month_diff == 0 && $day_diff < 0))
    {
      $year_diff--;
    }
    return $year_diff;
  }

  public static function getFullNameFromUserArray($user)
  {
    return $user["first_name"] . " " . $user["last_name"];
  }
}

In this example, we’ve kept the sideways compatibility of the getFullName function with array/object hydration, so it will work whether an object is passed or an array (sanity checking needed of course). This method could be expanded to support more “raw” hydration methods also. Now in our templates, we can just use:

<li><?php echo $user["username"]; ?></li>
<li><?php echo $user["Addresses"][0]["city"]; ?></li>
<li><?php echo User::getFullNameFromUserArray($user); ?></li>
<li><?php echo User::getAgeFromDOB($user["dob"]); ?></li>
<li>...</li>

This can massively speed up your application if you have long lists or are generally dealing with lots of data, especially data that spans several levels.

Let’s also take a second to consider the super-fast hydration methods. If we are really struggling for resources (perhaps working with large downloadable reports) and can live with slightly less friendly arrays of data, scalar hydration can save the day. Shifting to scalar hydration means we lose the ability to switch to object hydration later without changes, due to the different syntax and the lack of nesting, but if we are considering scalar hydration in the first place we are probably in a situation where object hydration will never be practical.

<li><?php echo $results["u_username"]; ?></li>
<li><?php echo $results["a_city"]; ?></li>
<li><?php echo User::getFullNameFromUserArray(array("first_name" => $results["u_first_name"], "last_name" => $results["u_last_name"])); ?></li>
<li><?php echo User::getAgeFromDOB($results["u_dob"]); ?></li>
<li>...</li>

I could go on and on but I hope these points have given you some food for thought!

by Russ at Fri 07 Oct 2011, 13:35

05 October 2011

Magnus Hagander

Stockholm PUG finally off the ground

Last night, we finally got a PostgreSQL User Group in Stockholm started. We've discussed this for years, but never got around to making it actually happen. Well, with big thanks to Claes who took care of the main organization tasks, we finally did - and I'll happily declare it a big success. It was our first meeting, and we actually didn't promote it very well (so bad that at least one fairly well-connected PostgreSQL community guy didn't realize it was on until registration was already closed - I'm sure others missed it too), and we still managed to get more than 30 people there! Awesome!

Hopefully we can keep the numbers at this level. For now, we are planning to meet around once every three months or so, which means we'll be looking at the next meeting sometime in January. Exact date, and also location, yet to be decided upon.

Claes is supposed to be setting us up with a website (we have plenty of domains already...) and an associated mailinglist, and I guess a registered IRC channel as well. Hopefully soon. But given that he set us up with a room, a projector, pizza and beer last night (thanks, btw, and thanks to Glue for picking up the bill), I think we can give him a couple of hours before we start complaining...

So - see you at the next Stockholm PUG meeting!

by Magnus Hagander at Wed 05 Oct 2011, 08:54

26 September 2011

Kacper Wysocki

oh noes, o cert my *sniff* cert

I’m not going to tell you about DigiNotar, whose file of bankruptcy this month held shock for no one after recently having lost the keys to the grand vault, in which the government held much stock. Though I have many comments upon the sophistication of the player that so thoroughly owned the most trusted agencies of the digital age….

The cracker hardly needed them skillz, considering it has been a challenge to keep that whole corrupt industry accountable. The trouble with the central authority system is that even if only one of the keys is compromised, the system is broken and gives no assurances whatsoever. No warning bells either. Just a sweet silent man in the middle, passing along all the best parts to his lover.

It’s not a joke for the 300,000+ people who documentedly had their emails and facepalms compromised. We thought he was kind to give an interview and we wait in awe for his next move.

I’m not going to mention the fatal flaws in certificate revocation that became embarrassingly apparent when the damage was done.
What’s hardly the matter since this kind of thing is bound to crop up, that hole in TLS was deemed unexploitable – now there’s a Titanic if I ever saw one. Unsinkable. Too fat to die.

SSL is an open book for those who dare to look, and it’s got more than a couple old bugs. It’s okay though, we can patch it, they will say. Dare to look the other way!
Not that you need those anyway, since there are some really nice sslsnarfing techniques out there that entirely forgo attacks on SSL as “too inefficient”.

But I say nay! Unacceptable. There is another way.. and we’re already doing it! We sign our own signatures and we back each other’s signatures.
Now that’s business, something that the companies on your CA trusted list were painfully aware of when they laid down the law of the code and put themselves on the trust list. Yet still ca-cert is not on your trust list, and warning bells go off on some of the most trustworthy sites: the self-signed ones.

Just don’t ask them why or how, or anything that isn’t directly relevant. Do you even know what is on your trust list? You might just be surprised at what you can find.

# ls -al /etc/ssl/certs | wc -l
479

How many of these do you trust? How many of these should you trust? I’ll tell you: *none*.

We should not be adding static lists of central signing authorities to our systems. This is a brittle and dangerous system. We knew this, but hackers have now thankfully demonstrated it.
A better way is for every person (and by extension every browser) to keep their own list of signing certs, and to exchange these certs with their friends (automagically, if you like). Your friends lists can come out of a social network, any social network, and it will mean that any site that has been vetted by one or more of your friends will likely be safe for you to use as well. It’s even better than that, you can check certs from multiple friends and detect discrepancies.

That, my friends, is called the Web of Trust, and is a design that is heading in the right direction. convergence.io is already bringing something similar to a Firefox near you, while PGP and GPG have worked like this for two decades!

It has to be simple. It has to be very simple. And it has to be chemically free of one word: ‘central’.

One real easy way to do this on linux would be using git and signed manifests. I already do this in gone to assure that only files on a manifest signed by a trusted key get installed.

by kacper at Mon 26 Sep 2011, 12:26

23 September 2011

Kacper Wysocki

ip6 DNS wildcards considered harmful

I discovered something yesterday that might be of consequence:
If you have ip6 connectivity the domain name resolver will prefer an ip6 wildcard domain over an ip4 A or CNAME record. This breaks things like ssh. You’d expect the resolver to choose the response that is most specific, the same way ip4 wildcards work, and not to blindly prefer ip6 wildcards.

Consider the case of Mary, who’s been around and has lots of domains:

hail.mary.com
naked.mary.com
see.mary.com

and she’s also wildcarding all the other *.mary.com to her vanity host me.mary.com… you get the idea, it’s fairly typical. Those hosts only have ip4 connectivity. Now she adds a new address ip6.mary.com and puts a wildcard ip6 record *.mary.com, expecting that people accessing foo.mary.com on ip6 get the wildcard address – and they do! But she gets a lot more than the doctor ordered: her ip6 clients will also get the ip6 wildcard address for all her other domains! hail.mary.com, naked.mary.com and see.mary.com will all land on ip6.mary.com instead of the ip4 A records. What happened here?
Effectively, Mary’s ip6 wildcard broke all ip6 to ip4 connectivity for Mary’s existing subdomains!

Yep, you can fix it on your machine, but this is a client problem and you can’t fix everybody else’s resolvers, so what you have to do is avoid ip6 wildcard domains ENTIRELY. Thanks a bunch.

On a completely different note:

“debug This option is recognized by pam_ldap but is presently ignored.”

I mean wow. What did they do, write the whole module flawlessly on the first try? I wish.

by kacper at Fri 23 Sep 2011, 09:44

19 September 2011

Magnus Hagander

pgconf.eu schedule & keynote announced

A little bit later than we hoped, we have now finally published the schedule for pgconf.eu. Three days full of presentations to choose from - and of course also the always popular lightning talk sessions. The schedule listed now is what we consider the final version, but we obviously reserve the right to make last-minute modifications both to which talks are included and exactly when they are scheduled, if necessary.

Keynote speaker

We are also happy to announce that the conference keynote will be presented by Ram Mohan, CTO of Afilias, who will be talking about how Afilias has built their company on open source solutions, and how this has turned into a great success. Afilias as a company has been deeply involved with PostgreSQL for a long time, including employing former Core Team member Jan Wieck and leading the development of the Slony replication system.

by Magnus Hagander at Mon 19 Sep 2011, 16:57