Posted in Information Technology, News

Goodbye accountants! Startup builds AI to automate all your accounting

Smacc, which uses AI to automate accounting, has secured a $3.5 million Series A round from Cherry Ventures, Rocket Internet, Dieter von Holtzbrinck Ventures, Grazia Equity and business angels.

Smacc offers small and medium-sized enterprises a platform to digitize and automate accounting and financial processes.

The founding trio Uli Erxleben, Janosch Novak and Stefan Korsch came up with the idea after finding accounting to be the most painful part of their own startup. Erxleben managed Rocket Internet’s US ventures in New York and San Francisco, and is also the founder of Berliner Berg, a craft beer startup.

Customers submit their receipts to Smacc, which turns them into a machine-readable format, encrypts them, then allocates them to an account. The platform gradually also self-learns, tracking invoices, sales and costs, as well as their liquidity.

The system checks each invoice against some 64 data points to verify it, checking, for example, that the math adds up and even that the VAT ID and its issuer are correct. Once the system has learned how to deal with a supplier at line-item level, it does so automatically. Over time, says the startup, it becomes better and better at automatically handling and allocating the data.
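To make the "math adds up" check concrete, here is a minimal sketch in Python. It is not Smacc's implementation; the function name, its parameters and the one-cent tolerance are assumptions chosen for illustration only.

```python
from decimal import Decimal


def invoice_totals_consistent(line_amounts, vat_rate, stated_vat, stated_gross,
                              tolerance=Decimal("0.01")):
    """Check that net + VAT equals the gross total printed on an invoice.

    All names and the tolerance are hypothetical; a real system would
    check many more data points than these two.
    """
    net = sum(line_amounts, Decimal("0"))
    # Round the computed VAT to cents before comparing it with the invoice.
    expected_vat = (net * vat_rate).quantize(Decimal("0.01"))
    expected_gross = net + expected_vat
    return (abs(expected_vat - stated_vat) <= tolerance
            and abs(expected_gross - stated_gross) <= tolerance)
```

Using `Decimal` rather than floats avoids rounding surprises when comparing currency amounts.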

“Now you have all you need for liquidity planning and revenue/expense reports close to real-time in the tool w/o the need to input data yourself or wait for your external account to do it for you at month’s end,” says Erxleben.

There are other cloud-based accounting software providers out there, such as Xero and Crunch Accounting, but Smacc says what makes them different is the high level of automation of all the accounting and finance processes that companies usually input manually.

 


Posted in Information Technology, Security

Pytbull – Intrusion Detection/Prevention System

pytbull is an Intrusion Detection/Prevention System (IDS/IPS) Testing Framework for Snort, Suricata and any IDS/IPS that generates an alert file. It can be used to test the detection and blocking capabilities of an IDS/IPS, to compare IDS/IPS, to compare configuration modifications and to check/validate configurations.

The framework is shipped with about 300 tests grouped in 11 testing modules:

  1. ** badTraffic ** : Non RFC compliant packets are sent to the server to test how packets are processed.
  2. ** bruteForce ** : tests the ability of the server to track brute force attacks (e.g. FTP). Makes use of custom rules on Snort and Suricata.
  3. ** clientSideAttacks ** : this module uses a reverse shell to provide the server with instructions to download remote malicious files. This module tests the ability of the IDS/IPS to protect against client-side attacks.
  4. ** denialOfService ** : tests the ability of the IDS/IPS to protect against DoS attempts
  5. ** evasionTechniques ** : various evasion techniques are used to check if the IDS/IPS can detect them.
  6. ** fragmentedPackets ** : various fragmented payloads are sent to server to test its ability to recompose them and detect the attacks.
  7. ** ipReputation ** : tests the ability of the server to detect traffic from/to low reputation servers.
  8. ** normalUsage ** : Payloads that correspond to a normal usage.
  9. ** pcapReplay ** : replays pcap files
  10. ** shellCodes ** : send various shellcodes to the server on port 21/tcp to test the ability of the server to detect/reject shellcodes.
  11. ** testRules ** : basic rules testing. These attacks are supposed to be detected by the rules sets shipped with the IDS/IPS.

It is easily configurable and could integrate new modules in the future.

There are basically 5 types of tests:

  1. ** socket ** : open a socket on a given port and send the payloads to the remote target on that port.
  2. ** command ** : send command to the remote target with the subprocess.call() python function.
  3. ** scapy ** : send special crafted payloads based on the Scapy syntax
  4. ** client side attacks ** : use a reverse shell on the remote target and send commands to it to make them processed by the server (typically wget commands).
  5. ** pcap replay ** : replays traffic based on pcap files
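As an illustration of the socket test type, the following is a minimal sketch of opening a socket on a given port and sending a payload to the target. It is not taken from the pytbull source; the function name `send_payload` and its parameters are assumptions.

```python
import socket


def send_payload(host: str, port: int, payload: bytes, timeout: float = 2.0) -> bytes:
    """Open a TCP connection, send one payload and return the target's reply.

    Hypothetical helper for illustration; pytbull's own code may differ.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        sock.shutdown(socket.SHUT_WR)  # signal that we are done sending
        chunks = []
        try:
            while True:
                data = sock.recv(4096)
                if not data:  # peer closed the connection
                    break
                chunks.append(data)
        except socket.timeout:
            pass  # target kept the connection open without replying further
        return b"".join(chunks)
```

A test run would then compare the payloads sent this way with the alerts that appear in the IDS/IPS alert file.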

Architecture

Remote mode

In this mode, the IDS is plugged on the span port (or port mirroring) of the core switch and is configured in promiscuous mode. The IDS analyzes all traffic that goes through the core switch. Malicious files can be downloaded either by pytbull or by the server. This mode is called “remote”.

Local mode

In this mode, files are downloaded on the client pytbull is started from.

IDS mode with attacked server in DMZ

In this configuration, a firewall splits the network into 3 parts (lan, wan, dmz). The IDS is plugged into a span port (or port mirroring) of the switch with its interface configured in promiscuous mode. It will analyze all traffic that is sent to the LAN interface of the firewall.

IPS mode

In this configuration, a firewall splits the network into 3 parts (lan, wan, dmz). The IDS is plugged between pytbull and the firewall. To give the IDS a chance to detect the malicious files, pytbull has to download the infected files itself.

IPS mode with attacked server in DMZ

In this configuration, a firewall splits the network into 3 parts (lan, wan, dmz). The IDS is plugged between pytbull and the firewall. Malicious files have to be downloaded by pytbull directly to give the IDS a chance to detect them.

Usage

If you have selected the clientSideAttacks module (see configuration file section for more information), you will need to start the reverse shell on the server. The following command uses port 34567/tcp:

$ ./pytbull-server.py -p 34567

Since the files are downloaded in the current directory, you can create a pdf/ directory and start pytbull from the parent location:

$ mkdir pdf/
$ cd pdf/
$ ../pytbull-server.py -p 34567

Then start pytbull (on the client side). An example to start pytbull tests against 192.168.100.48, running Snort:

$ sudo ./pytbull -t 192.168.100.48

_ Note that you will need to adapt the port used by the reverse shell in config.cfg if you use the optional parameter -p on the remote side. _

Posted in Devops, Information Technology

IT-Ops Cheat Sheet

Software Firewalls, LBs

Install Servers

  • Cobbler
  • MAAS – Ubuntu “Metal As A Service” install server
  • Foreman (integrated with puppet)
  • SpaceWalk (Kickstart based for Redhat and Solaris)
  • Razor (from Puppetlabs)

Deployment

Orchestration Tools

  • JuJu: mostly for Ubuntu, service orchestration tool (Python, commercially backed)
  • Maestro (enterprise, commercial)
  • mcollective – Puppet parallelizing and orchestration framework
  • SaltStack

Orchestration Standards

Orchestration Frameworks

Security

Performance Debugging

Monitoring

Graphing/Trending

  • rrdtool (arcane, do not use directly)
  • Cacti (arcane, do not use)
  • Munin (easy to set up, good graphs, great defaults, aged)
  • Ganglia
  • Graphite, Grafana, influxDB (good customization, awful usability)
  • collectd

Active Service Checking

  • Nagios, Icinga
  • Nagios frameworks with Vendor Lock-in: Groundworks, OpsView, Zabbix
  • check_mk (OSS, multi-site Nagios GUI)

SNMP based

  • Naemon

Real Time Metrics

  • Single Host only

Hosting / ISP

API Documentation

Software Architecture

F5 Series on Architecture

1) Cloud/Automated Systems need an Architecture

https://devcentral.f5.com/articles/cloud-automated-systems-need-an-architecture-26975

2) The Service Model for Cloud/Automated Systems Architectures

https://devcentral.f5.com/articles/the-service-model-for-cloud-automated-systems-architectures-27129

3) The Deployment Model for Cloud/Automated Systems Architectures

https://devcentral.f5.com/articles/the-deployment-model-for-cloud-automated-systems-architectures-27228

4) The Operational Model for Cloud/Automated Systems Architectures

https://devcentral.f5.com/articles/the-operational-model-for-cloud-automated-systems-architectures-27254?sf92906191=1

People

Feature Management

Collaboration

Git Code Review Solutions

Skills:

Posted in Devops, Information Technology

Solutions Virtualization Cheat Sheet

A simple decision matrix for Linux virtualization solutions

Virtualization for Developers

Developers use virtualization to quickly spin up systems to test new features. Either you use images built by some CI process or you install machines ad hoc with your automation/deployment/orchestration solution. Solution criteria:

  • Testing on laptops/desktop PCs:
    • LXC or Docker containers on average PCs
    • Vagrant+VirtualBox on high-end PCs
  • Testing on self-hosted servers:
    • Docker using CI-build images
    • KVM with automation/deployment tool chain
  • Testing in the cloud:
    • Choose a cloud with good self-service
    • Self-service needs to be scriptable
    • Connect CI-build chain with self-service script

Container Virtualization

Which container solution should be used for development purposes?

Name

Pro

Contra

Pricing

  • See LXC
  • Easy image usage with many online repos
  • Widely used with Github
  • Orchestration support
  • See LXC
  • Unclear relation to LXC and systemd
  • Uses cgroups like LXC and systemd
  • Why having LXC and Docker on your laptop?
  • Heavy downloads when using online images

Free

CoreOS Rocket

  • App Container Specification
  • Theoretically Multi-OS Apps
  • Young, uncertain future
  • Not a major distro
  • Light-weight alternative to Vagrant+VirtualBox
  • No HW virtualization needed
  • No image usage
  • Bad template support debootstrap
  • Not really supported by distributions
  • No resource limitations
  • No security

Free

?

  • Less known
  • Custom kernel (before Linux 4.0)
  • Incomplete /proc

Free (Virtuozzo is enterprise)

 

Posted in Information Technology, Security

How to Detect Sniffer in Your Network

Xarp is an advanced anti-spoofing tool that flags spoofing attacks that use ARP (Address Resolution Protocol) to target your system. ARP attacks allow a hacker to manipulate the data sent over the network, including documents, emails, and VoIP conversations. Xarp uses active and passive modules to detect hackers inside the network. Having such a tool on the system is important because computer firewalls and OS security do not provide protection against ARP attacks.

Download latest Xarp version from http://xarp.software.informer.com/download/

After it has downloaded, install it on your computer. Now we will perform an attack on a system with Xarp installed. To show the tool’s effectiveness, we perform the attack with Bettercap.

As soon as Xarp detects an ARP attack, it shows an alert on the screen.

It is worth noting that neither Windows Firewall nor Windows Defender raised an alert or blocked the attack, whereas Xarp detected the intrusion and warned about it.
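To illustrate what passive ARP-spoofing detection can look like, here is a minimal sketch that flags any IP address whose MAC address changes between observations. This is a simplified assumption about one possible technique, not a description of how Xarp works internally; the function name and the input format are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def find_arp_conflicts(observations: Iterable[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Return (ip, known_mac, new_mac) for every IP whose MAC address changes.

    `observations` is an ordered stream of (ip, mac) pairs, for example
    parsed from captured ARP replies (the capture itself is out of scope).
    """
    known: Dict[str, str] = {}
    conflicts: List[Tuple[str, str, str]] = []
    for ip, mac in observations:
        mac = mac.lower()  # normalise so case differences are not flagged
        previous = known.get(ip)
        if previous is not None and previous != mac:
            conflicts.append((ip, previous, mac))
        known[ip] = mac
    return conflicts
```

A MAC change is not always an attack (a replaced network card or a DHCP reassignment looks the same), so a real tool would combine this with further checks.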

Author- Shivam Yadav is a certified ethical hacker, an enthusiast and a researcher in this field.

source:https://www.hackingarticles.in/detect-sniffer-network/

Posted in News

Woman wins $10,000 after suing Microsoft over ‘Forced’ Windows 10 Upgrade

Since the launch of Windows 10 in July last year, Microsoft has been constantly pestering users to upgrade their PCs running older versions of the operating system.

However, many users who are happy with Windows 7 or Windows 8.1 and don’t want to upgrade to Windows 10 now or anytime soon are sick of this forceful, unwanted upgrade.

One of the victims of this unwanted Windows 10 installation has made Microsoft pay $10,000.

A California woman has won $10,000 from Microsoft over an unwanted Windows 10 upgrade.


Teri Goldstein sued Microsoft for upgrading her computer to Windows 10 without her authorization, which made it slow and unusable for days at a time, reports the Seattle Times.

The PC used by Goldstein, who operates a Californian travel agency, was apparently upgraded to Windows 10 shortly after Microsoft offered a free upgrade to Windows 7 and 8.1 users last year.

Goldstein said the update, which she never asked for, was so problematic that it left her PC crashing and unusable for days at a time. She contacted Microsoft’s tech support, but they were unable to assist her.

So, Goldstein sued Microsoft for lost wages and the cost of a new computer.


Microsoft dropped its appeal in May this year to avoid further legal expenses, and Goldstein won the court case last month, awarded with $10,000 from Microsoft.

Microsoft has been aggressively pushing Windows 10 for computers running Windows 7 and 8 since the beginning of this year by re-categorizing Windows 10 as a “Recommended Update” in Windows Update, instead of an “optional update.”

However, if the affected users started suing the software giant, then it could cost Microsoft a lot more than the actual cost of its newest Windows 10 operating system.

In response, Microsoft said users have a month to roll back to their previous operating system and can always contact its customer support.

However, the company will not stop the ‘Upgrade to Windows 10’ notification from constantly showing up on your screen. Trust me; Microsoft has to achieve its goal to deploy Windows 10 on over 1 Billion devices worldwide as soon as possible.

Posted in Information Technology

Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase vs Couchbase vs OrientDB vs Aerospike vs Neo4j vs Hypertable vs ElasticSearch vs Accumulo vs VoltDB vs Scalaris vs RethinkDB comparison

While SQL databases are insanely useful tools, their monopoly in the last decades is coming to an end. And it’s just time: I can’t even count the things that were forced into relational databases, but never really fitted them. (That being said, relational databases will always be the best for the stuff that has relations.)

But the differences between NoSQL databases are much bigger than there ever were between one SQL database and another. This means that software architects bear a bigger responsibility for choosing the appropriate one for a project right at the beginning.

In this light, here is a comparison of the open-source NoSQL databases Cassandra, MongoDB, CouchDB, Redis, Riak, RethinkDB, Couchbase (ex-Membase), Hypertable, ElasticSearch, Accumulo, VoltDB, Kyoto Tycoon, Scalaris, OrientDB, Aerospike, Neo4j and HBase:

The most popular ones

Redis (V3.2)

  • Written in: C
  • Main point: Blazing fast
  • License: BSD
  • Protocol: Telnet-like, binary safe
  • Disk-backed in-memory database,
  • Master-slave replication, automatic failover
  • Simple values or data structures by keys
  • but complex operations like ZREVRANGEBYSCORE.
  • INCR & co (good for rate limiting or statistics)
  • Bit and bitfield operations (for example to implement bloom filters)
  • Has sets (also union/diff/inter)
  • Has lists (also a queue; blocking pop)
  • Has hashes (objects of multiple fields)
  • Sorted sets (high score table, good for range queries)
  • Lua scripting capabilities
  • Has transactions
  • Values can be set to expire (as in a cache)
  • Pub/Sub lets you implement messaging
  • GEO API to query by radius (!)

Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory).

For example: To store real-time stock prices. Real-time analytics. Leaderboards. Real-time communication. And wherever you used memcached before.
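The rate-limiting use of INCR mentioned in the list above can be sketched as a fixed-window counter: increment a per-client key and let it expire when the window ends. The sketch below runs against a small in-memory stand-in rather than a real Redis server; the class `InMemoryCounterStore`, the function `allow_request` and the key format are assumptions for illustration. A real client object that offers `incr` and `expire` methods could be passed in instead.

```python
import time
from typing import Callable, Dict, Tuple


class InMemoryCounterStore:
    """Tiny stand-in for the two Redis commands used below (INCR, EXPIRE)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[int, float]] = {}  # key -> (value, expires_at)

    def _purge(self, key: str) -> None:
        entry = self._data.get(key)
        if entry is not None and entry[1] <= self._clock():
            del self._data[key]  # key has expired

    def incr(self, key: str) -> int:
        self._purge(key)
        value, expires_at = self._data.get(key, (0, float("inf")))
        self._data[key] = (value + 1, expires_at)
        return value + 1

    def expire(self, key: str, seconds: float) -> None:
        self._purge(key)
        if key in self._data:
            value, _ = self._data[key]
            self._data[key] = (value, self._clock() + seconds)


def allow_request(store, client_id: str, limit: int, window_seconds: int) -> bool:
    """Fixed-window rate limiter: at most `limit` requests per window."""
    key = f"rate:{client_id}"
    count = store.incr(key)
    if count == 1:
        store.expire(key, window_seconds)  # the first hit starts the window
    return count <= limit
```

The fixed window is the simplest variant; it allows short bursts around window boundaries, which a sliding-window design would smooth out.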

Cassandra (2.0)

  • Written in: Java
  • Main point: Store huge datasets in “almost” SQL
  • License: Apache
  • Protocol: CQL3 & Thrift
  • CQL3 is very similar to SQL, but with some limitations that come from the scalability (most notably: no JOINs, no aggregate functions.)
  • CQL3 is now the official interface. Don’t look at Thrift, unless you’re working on a legacy app. This way, you can live without understanding ColumnFamilies, SuperColumns, etc.
  • Querying by key, or key range (secondary indices are also available)
  • Tunable trade-offs for distribution and replication (N, R, W)
  • Data can have expiration (set on INSERT).
  • Writes can be much faster than reads (when reads are disk-bound)
  • Map/reduce possible with Apache Hadoop
  • All nodes are similar, as opposed to Hadoop/HBase
  • Very good and reliable cross-datacenter replication
  • Distributed counter datatype.
  • You can write triggers in Java.

Best used: When you need to store data so huge that it doesn’t fit on a single server, but still want a friendly familiar interface to it.

For example: Web analytics, to count hits by hour, by browser, by IP, etc. Transaction logging. Data collection from huge sensor arrays.
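The (N, R, W) trade-off listed above follows the usual quorum rule: with N replicas, a read that contacts R of them is guaranteed to overlap with the latest write acknowledged by W of them whenever R + W > N. A minimal sketch of that arithmetic follows; the function name is an assumption and this is not Cassandra API code.

```python
def read_overlaps_latest_write(n: int, r: int, w: int) -> bool:
    """Return True if every read quorum intersects every write quorum.

    n: replication factor, r: replicas contacted per read,
    w: replicas that must acknowledge a write.
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n
```

For example, with N = 3, reading and writing at quorum (R = 2, W = 2) overlaps, while R = 1, W = 1 trades that guarantee for lower latency.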

MongoDB (3.2)

  • Written in: C++
  • Main point: JSON document store
  • License: AGPL (Drivers: Apache)
  • Protocol: Custom, binary (BSON)
  • Master/slave replication (auto failover with replica sets)
  • Sharding built-in
  • Queries are javascript expressions
  • Run arbitrary javascript functions server-side
  • Geospatial queries
  • Multiple storage engines with different performance characteristics
  • Performance over features
  • Document validation
  • Journaling
  • Powerful aggregation framework
  • On 32bit systems, limited to ~2.5Gb
  • Text search integrated
  • GridFS to store big data + metadata (not actually an FS)
  • Has geospatial indexing
  • Data center aware

Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.

For example: For most things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.

ElasticSearch (0.20.1)

  • Written in: Java
  • Main point: Advanced Search
  • License: Apache
  • Protocol: JSON over HTTP (Plugins: Thrift, memcached)
  • Stores JSON documents
  • Has versioning
  • Parent and children documents
  • Documents can time out
  • Very versatile and sophisticated querying, scriptable
  • Write consistency: one, quorum or all
  • Sorting by score (!)
  • Geo distance sorting
  • Fuzzy searches (approximate date, etc) (!)
  • Asynchronous replication
  • Atomic, scripted updates (good for counters, etc)
  • Can maintain automatic “stats groups” (good for debugging)

Best used: When you have objects with (flexible) fields, and you need “advanced search” functionality.

For example: A dating service that handles age difference, geographic location, tastes and dislikes, etc. Or a leaderboard system that depends on many variables.

Classic document and BigTable stores

CouchDB (V1.2)

  • Written in: Erlang
  • Main point: DB consistency, ease of use
  • License: Apache
  • Protocol: HTTP/REST
  • Bi-directional (!) replication,
  • continuous or ad-hoc,
  • with conflict detection,
  • thus, master-master replication. (!)
  • MVCC – write operations do not block reads
  • Previous versions of documents are available
  • Crash-only (reliable) design
  • Needs compacting from time to time
  • Views: embedded map/reduce
  • Formatting views: lists & shows
  • Server-side document validation possible
  • Authentication possible
  • Real-time updates via ‘_changes’ (!)
  • Attachment handling
  • thus, CouchApps (standalone js apps)

Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.

For example: CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.

Accumulo (1.4)

  • Written in: Java and C++
  • Main point: A BigTable with Cell-level security
  • License: Apache
  • Protocol: Thrift
  • Another BigTable clone, also runs on top of Hadoop
  • Originally from the NSA
  • Cell-level security
  • Bigger rows than memory are allowed
  • Keeps a memory map outside Java, in C++ STL
  • Map/reduce using Hadoop’s facilities (ZooKeeper & co)
  • Some server-side programming

Best used: If you need to restrict access on the cell level.

For example: Same as HBase, since it’s basically a replacement: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables is a requirement.

HBase (V0.92.0)

  • Written in: Java
  • Main point: Billions of rows X millions of columns
  • License: Apache
  • Protocol: HTTP/REST (also Thrift)
  • Modeled after Google’s BigTable
  • Uses Hadoop’s HDFS as storage
  • Map/reduce with Hadoop
  • Query predicate push down via server side scan and get filters
  • Optimizations for real time queries
  • A high performance Thrift gateway
  • HTTP supports XML, Protobuf, and binary
  • Jruby-based (JIRB) shell
  • Rolling restart for configuration changes and minor upgrades
  • Random access performance is like MySQL
  • A cluster consists of several different types of nodes

Best used: Hadoop is probably still the best way to run Map/Reduce jobs on huge datasets. Best if you use the Hadoop/HDFS stack already.

For example: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables is a requirement.

Hypertable (0.9.6.5)

  • Written in: C++
  • Main point: A faster, smaller HBase
  • License: GPL 2.0
  • Protocol: Thrift, C++ library, or HQL shell
  • Implements Google’s BigTable design
  • Run on Hadoop’s HDFS
  • Uses its own, “SQL-like” language, HQL
  • Can search by key, by cell, or for values in column families.
  • Search can be limited to key/column ranges.
  • Sponsored by Baidu
  • Retains the last N historical values
  • Tables are in namespaces
  • Map/reduce with Hadoop

Best used: If you need a better HBase.

For example: Same as HBase, since it’s basically a replacement: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables is a requirement.

Graph databases

OrientDB (2.0)

  • Written in: Java
  • Main point: Document-based graph database
  • License: Apache 2.0
  • Protocol: binary, HTTP REST/JSON, or Java API for embedding
  • Has transactions, full ACID conformity
  • Can be used both as a document and as a graph database (vertices with properties)
  • Both nodes and relationships can have metadata
  • Multi-master architecture
  • Supports relationships between documents via persistent pointers (LINK, LINKSET, LINKMAP, LINKLIST field types)
  • SQL-like query language (Note: no JOIN, but there are pointers)
  • Web-based GUI (quite good-looking, self-contained)
  • Inheritance between classes. Indexing of nodes and relationships
  • User functions in SQL or JavaScript
  • Sharding
  • Advanced path-finding with multiple algorithms and Gremlin traversal language
  • Advanced monitoring, online backups are commercially licensed

Best used: For graph-style, rich or complex, interconnected data.

For example: For searching routes in social relations, public transport links, road maps, or network topologies.

Neo4j (V1.5M02)

  • Written in: Java
  • Main point: Graph database – connected data
  • License: GPL, some features AGPL/commercial
  • Protocol: HTTP/REST (or embedding in Java)
  • Standalone, or embeddable into Java applications
  • Full ACID conformity (including durable data)
  • Both nodes and relationships can have metadata
  • Integrated pattern-matching-based query language (“Cypher”)
  • Also the “Gremlin” graph traversal language can be used
  • Indexing of nodes and relationships
  • Nice self-contained web admin
  • Advanced path-finding with multiple algorithms
  • Indexing of keys and relationships
  • Optimized for reads
  • Has transactions (in the Java API)
  • Scriptable in Groovy
  • Clustering, replication, caching, online backup, advanced monitoring and High Availability are commercially licensed

Best used: For graph-style, rich or complex, interconnected data.

For example: For searching routes in social relations, public transport links, road maps, or network topologies.

The “long tail”
(Not widely known, but definitely worthy ones)

Couchbase (ex-Membase) (2.0)

  • Written in: Erlang & C
  • Main point: Memcache compatible, but with persistence and clustering
  • License: Apache
  • Protocol: memcached + extensions
  • Very fast (200k+/sec) access of data by key
  • Persistence to disk
  • All nodes are identical (master-master replication)
  • Provides memcached-style in-memory caching buckets, too
  • Write de-duplication to reduce IO
  • Friendly cluster-management web GUI
  • Connection proxy for connection pooling and multiplexing (Moxi)
  • Incremental map/reduce
  • Cross-datacenter replication

Best used: Any application where low-latency data access, high concurrency support and high availability is a requirement.

For example: Low-latency use-cases like ad targeting or highly-concurrent web apps like online gaming (e.g. Zynga).

Scalaris (0.5)

  • Written in: Erlang
  • Main point: Distributed P2P key-value store
  • License: Apache
  • Protocol: Proprietary & JSON-RPC
  • In-memory (disk when using Tokyo Cabinet as a backend)
  • Uses YAWS as a web server
  • Has transactions (an adapted Paxos commit)
  • Consistent, distributed write operations
  • From CAP, values Consistency over Availability (in case of network partitioning, only the bigger partition works)

Best used: If you like Erlang and wanted to use Mnesia or DETS or ETS, but you need something that is accessible from more languages (and scales much better than ETS or DETS).

For example: In an Erlang-based system when you want to give access to the DB to Python, Ruby or Java programmers.

Aerospike (3.4.1)

  • Written in: C
  • Main point: Speed, SSD-optimized storage
  • License: AGPL (Client: Apache)
  • Protocol: Proprietary
  • Cross-datacenter replication is commercially licensed
  • Very fast access of data by key
  • Uses SSD devices as a block device to store data (RAM + persistence also available)
  • Automatic failover and automatic rebalancing of data when nodes are added to or removed from the cluster
  • User Defined Functions in Lua
  • Cluster management with Web GUI
  • Has complex data types (lists and maps) as well as simple (integer, string, blob)
  • Secondary indices
  • Aggregation query model
  • Data can be set to expire with a time-to-live (TTL)
  • Large Data Types

Best used: Any application where low-latency data access, high concurrency support and high availability is a requirement.

For example: Storing massive amounts of profile data in online advertising or retail Web sites.

RethinkDB (2.1)

  • Written in: C++
  • Main point: JSON store that streams updates
  • License: AGPL (Client: Apache)
  • Protocol: Proprietary
  • JSON document store
  • Javascript-based query language, “ReQL”
  • ReQL is functional, if you use Underscore.js it will be quite familiar
  • Sharded clustering, replication built-in
  • Data is JOIN-able on references
  • Handles BLOBS
  • Geospatial support
  • Multi-datacenter support

Best used: Applications where you need constant real-time updates.

For example: Displaying sports scores on various displays and/or online. Monitoring systems. Fast workflow applications.

Riak (V1.2)

  • Written in: Erlang & C, some JavaScript
  • Main point: Fault tolerance
  • License: Apache
  • Protocol: HTTP/REST or custom binary
  • Stores blobs
  • Tunable trade-offs for distribution and replication
  • Pre- and post-commit hooks in JavaScript or Erlang, for validation and security.
  • Map/reduce in JavaScript or Erlang
  • Links & link walking: use it as a graph database
  • Secondary indices: but only one at once
  • Large object support (Luwak)
  • Comes in “open source” and “enterprise” editions
  • Full-text search, indexing, querying with Riak Search
  • In the process of migrating the storing backend from “Bitcask” to Google’s “LevelDB”
  • Masterless multi-site replication and SNMP monitoring are commercially licensed

Best used: If you want Dynamo-like data storage, but there is no way you’re going to deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you’re ready to pay for multi-site replication.

For example: Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt. Could be used as a well-update-able web server.

VoltDB (2.8.4.1)

  • Written in: Java
  • Main point: Fast transactions and rapidly changing data
  • License: AGPL v3 and proprietary
  • Protocol: Proprietary
  • In-memory relational database.
  • Can export data into Hadoop
  • Supports ANSI SQL
  • Stored procedures in Java
  • Cross-datacenter replication

Best used: Where you need to act fast on massive amounts of incoming data.

For example: Point-of-sales data analysis. Factory control systems.

Kyoto Tycoon (0.9.56)

  • Written in: C++
  • Main point: A lightweight network DBM
  • License: GPL
  • Protocol: HTTP (TSV-RPC or REST)
  • Based on Kyoto Cabinet, Tokyo Cabinet’s successor
  • Multitudes of storage backends: Hash, Tree, Dir, etc (everything from Kyoto Cabinet)
  • Kyoto Cabinet can do 1M+ insert/select operations per sec (but Tycoon does less because of overhead)
  • Lua on the server side
  • Language bindings for C, Java, Python, Ruby, Perl, Lua, etc
  • Uses the “visitor” pattern
  • Hot backup, asynchronous replication
  • Background snapshot of in-memory databases
  • Auto expiration (can be used as a cache server)

Best used: When you want to choose the backend storage algorithm engine very precisely. When speed is of the essence.

For example: Caching server. Stock prices. Analytics. Real-time data collection. Real-time communication. And wherever you used memcached before.

source:

https://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

 

Posted in Devops, Information Technology

Solutions Automation Cheat Sheet

Classical Automation Tools

| Framework | DSL | Push/Pull | CM | CM Encryption | Drift Management | Job Scheduling | Orchestration |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ansible | Proprietary | Push | Built-in | Built-in | ? | Ansible Tower | Ansible Tower |
| cfengine | Proprietary | Push/Pull | ? | ? | ? | Enterprise Only | ? |
| Puppet | Ruby | Push/Pull | Hiera | Hiera Eyaml | Foreman, PuppetDB, Polscan | Puppet Enterprise | Puppet Enterprise, mcollective |
| Chef | Ruby | Push/Pull | Builtin | Builtin | | Pushy, (knife plugin + ZeroMQ) | % |
| Saltstack | Python | Push | Builtin | Builtin | ? | salt-run | Saltstack Enterprise |

Smaller automation tools

  • Bcfg2: Alternative to puppet and cfengine by Argonne National Laboratory. (IMO out-dated)
  • cdist: configuration with shell scripting
  • EMC UIM
    • Unified Infrastructure Manager, VCE VBlock (enterprise, commercial)
  • slaughter (Perl, active, small user base)
  • Sprinkle (Ruby, quite recent)
  • Rundeck – Workflow manager for node – role systems like EC2, chef, puppet …
  • IBM Tivoli

Finally, it is worth checking the Wikipedia Comparison Chart for other less known and new tools!

Automation Drift Management

Testing

Misc

  • Augeas: Very flexible file editor to be used with Puppet or standalone. Could also work with cfengine.
    $ augtool
    augtool> set /files/etc/ssh/sshd_config/PermitRootLogin no
    augtool> save
    
  • cfengine – Force running shortly after a recent execution
    cfagent -K
    
  • cfengine – Design Center: Git repository with sketches and examples for cfengine.
  • cfengine – Find and install sketches from the Design Center repository
    # cf-sketch --search utilities
    Monitoring::nagios_plugin_agent /tmp/design-center/sketches/utilities/nagios_plugin_agent
    [...]
    # cf-sketch --install Monitoring::nagios_plugin_agent
    
  • SaltStack – Run commands
    salt '*' cmd.run 'apt-get install bash'
    
  • SaltStack – Batch concurrency
    salt '*' state.highstate -b <count>
    
  • osquery – Facebook SQL facter
    echo "SELECT * FROM etc_hosts;" | osqueryi
    
    $ osqueryi
    osquery> SELECT
        ...> u.username,
        ...> g.groupname
        ...> FROM users as u
        ...> JOIN groups as g ON u.gid = g.gid;

Posted in Information Technology

SaaS Dev Tools Cheat Sheet

Online Tools

SaaS Discovery

Posted in Information Technology

Dev-Misc Cheat Sheet

Misc

  • JSON Linting
    python -mjson.tool input.json
    
  • iotrack: LD_PRELOAD based I/O tracking
  • Linux Debugging Techniques: DeveloperWorks article on many debugging tools: MEMWATCH, YAMD, electric fence, gdb, kgdb, kdb
  • Google Address Sanitizer (Asan) for GCC 4.8+ and LLVM
  • cppcheck – static code analysis
  • ELF Inspection
    readelf -l <binary>
    
  • kcachegrind: callgrind visualization
  • Object Dumping
    objdump -t <object file>   # print symbols table
    objdump -dS <object file>  # print assembly along source lines
    
  • Fedora – Crash Tracker retrace.fedoraproject.org/faf
  • Ubuntu – Crash Tracker errors.ubuntu.com
  • Java – Debugging Flags
    -J-Xdebug -J-Xnoagent -J-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=9876
    
  • Gearman – Jobserver

Security

Test Frameworks

Low-level C test frameworks:

  • Glib Testing
  • lcov – GCC based test coverage metrics:
    apt-get install lcov
    CFLAGS=--coverage ./configure
    # Run tests
    lcov --capture --directory <project-dir> --output-file coverage.info
    genhtml coverage.info --output-directory out
    
  • Ruby rspec – Launch tests
    # There are a lot of rspec launch variants:
    autotest
    rspec <path to .rb spec file>
    rspec <path to directory>
    bundle exec rspec <path to .rb spec file>
    

Java

Web

XML

  • Pretty-print XML:
    xmllint --format my.xml
    
  • XPath on the command line
    # Print subtree of tag 'sometag'
    xmllint --xpath "//sometag" data.xml
    
    # Match 'sometag' elements whose attribute 'someattr' contains the literal 'string'
    xmllint --xpath "//sometag[contains(@someattr, 'string')]" data.xml
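
The same attribute match can be scripted when xmllint is not available. The sketch below uses only Python's standard library; because `xml.etree.ElementTree` supports just a subset of XPath (no `contains()`), the substring test is done in Python. The function name and its parameters are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def find_by_attr_substring(xml_text: str, tag: str, attr: str, needle: str):
    """Return all <tag> elements whose attribute `attr` contains `needle`."""
    root = ET.fromstring(xml_text)
    # root.iter(tag) walks the whole tree, including the root element itself.
    return [el for el in root.iter(tag) if needle in el.get(attr, "")]
```

Elements that lack the attribute are skipped, since the empty default never contains a non-empty `needle`.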