<h2 id="whats-new-with-athena-and-cloudtrail">What’s new with Athena and CloudTrail</h2>
<p>AWS has made great strides to make CloudTrail far more useful in the past year. Recently AWS added a point-and-click wizard in CloudTrail to set up Athena, validating the strengths of this approach, but it stops short of giving good guidance on how to use and scale the result.</p>
<h4 id="acknowledgements">Acknowledgements</h4>
<p>I want to thank Corcoran Smith <a href="https://twitter.com/corcoranCI">@corcoranCI</a> for reminding me to update this article.</p>
<h3 id="setting-up-the-tables">Setting Up the Tables</h3>
<p>AWS released the CloudTrail SerDe sometime after my last post, and I have been using it for the past six to nine months. If you look at the <a href="/articles/using-aws-athena-to-query-cloudtrail-logs">last article</a> you will notice a very complicated CREATE TABLE statement; luckily, it has been reduced to this:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">CREATE EXTERNAL TABLE my_table_name (
eventversion STRING,
userIdentity STRUCT< type:STRING,
principalid:STRING,
arn:STRING,
accountid:STRING,
invokedby:STRING,
accesskeyid:STRING,
userName:STRING,
sessioncontext:STRUCT< attributes:STRUCT< mfaauthenticated:STRING,
creationdate:STRING>,
sessionIssuer:STRUCT< type:STRING,
principalId:STRING,
arn:STRING,
accountId:STRING,
userName:STRING>>>,
eventTime STRING,
eventSource STRING,
eventName STRING,
awsRegion STRING,
sourceIpAddress STRING,
userAgent STRING,
errorCode STRING,
errorMessage STRING,
requestParameters STRING,
responseElements STRING,
additionalEventData STRING,
requestId STRING,
eventId STRING,
resources ARRAY<STRUCT< ARN:STRING,
accountId: STRING,
type:STRING>>,
eventType STRING,
apiVersion STRING,
readOnly STRING,
recipientAccountId STRING,
serviceEventDetails STRING,
sharedEventID STRING,
vpcEndpointId STRING
) PARTITIONED BY(
region string,
year string,
month string
)
ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailSerde'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://my_consolidated_bucket/my-cross-account-prefix/AWSLogs/'</code></pre></figure>
<p>Some of the key changes are in how the data is parsed; there is less de-nesting of JSON, and ultimately it is much easier to query now. Also, I made sure to add partitions to the table. I’ll explain why that is important later, but add them now and I’ll show you how to automate it next.</p>
<h3 id="getting-started-with-partitions">Getting Started With Partitions</h3>
<p>What I discovered when querying data going back over time was inefficiency and, more importantly, increased cost. Partitions are the right way to solve this in Athena, but you have to add them individually to each table.</p>
<p>Given the amount of logs I have and the infrequency with which I look at some regions, I decided to partition on region, year, and month. To do that I first looked at Boto3, but unfortunately, as of this writing, there still is not a waiter for Athena queries. It is very easy to overrun the concurrent-query limits on DDL statements, so I went looking and found <a href="https://github.com/guardian/athena-cli">athena-cli</a>, a fantastic overlay on Boto3 and the CLI, which I cannot recommend highly enough.</p>
<p>To add the partitions, I loaded up a script and used the waiters native to athena-cli to ensure I didn’t overrun the limits. I added just enough concurrency to gain some speed while staying under my DDL limit.</p>
<p>For example, here is a query to add a partition to us-east-1 for April 2018 for account “999999999999”</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">ALTER TABLE my_table_name ADD PARTITION (region='us-east-1',year='2018',month='04')
location 's3://my_consolidated_bucket/my-cross-account-prefix/AWSLogs/999999999999/CloudTrail/us-east-1/2018/04/';</code></pre></figure>
<p>Also, you can pre-partition your data, so I generally load up a year’s worth of partitions at once (see the sketch below). Athena does not care whether the folder is present when you set up the partition. It is important to note that if you declare partitions in your schema but never create them, you will not see any data when you query.</p>
<p><strong>Note:</strong> When AWS presents you with the DDL from the CloudTrail screen, it does not contain partitions; I strongly encourage you to add them.</p>
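<p>Since Boto3 still has no waiter for Athena, here is a minimal sketch of that kind of loader script, polling each DDL statement by hand. It runs sequentially for clarity, and the database name, results bucket, account ID, and region list are placeholders:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python">import itertools
import time

import boto3

athena = boto3.client("athena")

REGIONS = ["us-east-1", "us-west-2"]  # extend to the regions you log
MONTHS = ["%02d" % m for m in range(1, 13)]
LOCATION = ("s3://my_consolidated_bucket/my-cross-account-prefix/AWSLogs/"
            "999999999999/CloudTrail/{region}/2018/{month}/")

def run_ddl(sql):
    """Start a DDL statement and poll it to completion by hand."""
    qid = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": "my_database"},
        ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
    )["QueryExecutionId"]
    while True:
        status = athena.get_query_execution(QueryExecutionId=qid)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return state
        time.sleep(1)  # one DDL in flight keeps us under the limits

# Pre-load a full year of partitions per region
for region, month in itertools.product(REGIONS, MONTHS):
    location = LOCATION.format(region=region, month=month)
    sql = ("ALTER TABLE my_table_name ADD IF NOT EXISTS PARTITION "
           "(region='{r}',year='2018',month='{m}') location '{l}'"
           .format(r=region, m=month, l=location))
    print(region, month, run_ddl(sql))</code></pre></figure>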
<h3 id="working-with-the-data">Working with the data</h3>
<p>Let’s look at the structure of a few records as they appear now.</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">eventversion | 1.04
useridentity | {type=IAMUser, principalid=LKUWHE3545KJ34534L65U,
arn=arn:aws:iam::999999999999:user/iam-username,
accountid=999999999999, invokedby=null,
accesskeyid=FAKEACCESSID, username=iam-username,
sessioncontext=null}
eventtime | 2018-04-04T23:55:30Z
eventsource | monitoring.amazonaws.com
eventname | DescribeAlarms
awsregion | us-east-1
sourceipaddress | 198.51.100.144
useragent | aws-cli/1.11.132
errorcode | NULL
errormessage | NULL
requestparameters | {"alarmNames":["alarmname"]}
responseelements | null
additionaleventdata | NULL
requestid | 09a5f980-13a2-48af-94d7-f27a2affbdbe
eventid | 55979b8b-494f-4c8f-9cf9-3edaadefe142
resources | NULL
eventtype | AwsApiCall
apiversion | NULL
readonly | NULL
recipientaccountid | 999999999999
serviceeventdetails | NULL
sharedeventid | NULL
vpcendpointid | NULL
region | us-east-1
year | 2018
month | 04</code></pre></figure>
<figure class="highlight"><pre><code class="language-text" data-lang="text">eventversion | 1.05
useridentity | {type=AssumedRole, principalid=LKUWHE3545KJ34534L65U:user@example.com,
arn=arn:aws:sts::999999999999:assumed-role/rolename/user@example.com,
accountid=999999999999, invokedby=null, accesskeyid=null,
username=null, sessioncontext=null}
eventtime | 2018-03-12T12:00:37Z
eventsource | signin.amazonaws.com
eventname | ConsoleLogin
awsregion | us-east-1
sourceipaddress | 198.51.100.144
useragent | Chrome/65.0.3325.146
errorcode | NULL
errormessage | NULL
requestparameters | null
responseelements | {"ConsoleLogin":"Success"}
additionaleventdata | {"LoginTo":"https://console.aws.amazon.com/console/home?region=us-east-1",
"MobileVersion":"No","MFAUsed":"No",
"SamlProviderArn":"arn:aws:iam::999999999999:saml-provider/MySamlIdp"}
requestid | NULL
eventid | 96b00be0-6600-4489-8f94-3f70b04c4a66
resources | NULL
eventtype | AwsConsoleSignIn
apiversion | NULL
readonly | NULL
recipientaccountid | 999999999999
serviceeventdetails | NULL
sharedeventid | NULL
vpcendpointid | NULL
region | us-east-1
year | 2018
month | 03</code></pre></figure>
<p>With this structure, the queries get simpler. Here is a query whose output would include the first record above, along with others like it:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">SELECT * FROM my_table_name
WHERE useridentity.username = 'iam-username'
AND year = '2018'
AND month = '03';</code></pre></figure>
<p>In this query you can see that useridentity supports dotted-notation addressing of its sub-fields, which enables very powerful queries through the Presto engine, including regular expressions. The other columns can be addressed normally by name, again much simpler than before.</p>
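<p>For instance, here is a sketch of a regular-expression match against a sub-field; the role-name pattern is purely illustrative:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">SELECT eventname, useridentity.arn, sourceipaddress
FROM my_table_name
WHERE regexp_like(useridentity.arn, ':assumed-role/admin-')
AND year = '2018'
AND month = '03';</code></pre></figure>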
<h3 id="query-and-performance-comparison">Query and Performance Comparison</h3>
<p>Now that the data is a bit easier to comprehend, how much easier are the queries to write? Just as important, how much faster and more efficient are they to run? For this test, I ran the following two queries against my largest account.</p>
<p>I am looking to find the 20 highest counts of a tuple of Event Names/ARN/SourceIP for March 2018 in us-east-1 only.</p>
<h4 id="previous-style">Previous Style</h4>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">SELECT record.eventName, record.userIdentity.arn, record.sourceIPAddress, COUNT(*)
FROM
(SELECT record
FROM my_table_name
CROSS JOIN UNNEST(records) AS t (record)) AS records
WHERE record.eventtime LIKE '2018-03-%' and record.awsregion = 'us-east-1'
GROUP BY record.eventName, record.userIdentity.arn, record.sourceIPAddress
ORDER BY COUNT(*) DESC
LIMIT 20;</code></pre></figure>
<h4 id="current-style">Current Style</h4>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">select eventname, useridentity.arn, sourceipaddress, count(*)
from my_table_name
where year = '2018'
and month = '03'
and region = 'us-east-1'
group by eventname, useridentity.arn, sourceipaddress
order by count(*) DESC
LIMIT 20</code></pre></figure>
<p>The old table and query format:</p>
<ul>
<li>449.72 seconds</li>
<li>Scanned 13.97GB of data</li>
<li>Cost $0.06985</li>
</ul>
<p>The new table and query format:</p>
<ul>
<li>15.03 seconds</li>
<li>Scanned 1.2GB of data</li>
<li>Cost $0.006</li>
</ul>
<h3 id="conclusion">Conclusion</h3>
<p>In summary, the new table structure and queries are much faster, cheaper, and easier. In fact, on this test they are roughly 12x cheaper and 30x faster. There really is little disadvantage to changing to the new schema.</p>
<p>Do you have any cool queries you wrote to summarize your data? I would love to hear from you.</p>
<p><a href="/articles/using-aws-athena-and-cloudtrail-revisited/">Using AWS Athena and CloudTrail Revisited</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on April 17, 2018.</p>
<h2 id="going-down-the-rabbit-hole---a-historical-look-at-aws-pricing">Going Down the Rabbit Hole - A Historical Look at AWS Pricing</h2>
<p>When a coworker asked me if AWS had a historical pricing sheet, I was astounded to find out the answer was no. I went digging into the AWS landscape for answers, and here is what I found. AWS has publicly reduced its pricing across various services 62 or 65 times, depending on whom you ask and what metric you use. With that many changes, it becomes hard to discern historical pricing trends beyond “AWS makes things cheaper as its internal modeling allows.”</p>
<h3 id="selecting-the-target">Selecting the Target</h3>
<p>I decided to deep dive on this topic with the one service which has the longest pricing history and the most consistent format: S3. For those of you who have never looked or might have forgotten, S3 was one of the first three offerings from the AWS team, with EC2 and SQS being the other two. While SQS is the eldest service, it hasn’t seen many price reductions in its lifetime. Part of this is because it is a managed offering, but more likely it is because SQS is not really tied to economies of scale. The next logical place to look would be EC2. EC2 has a long history of price reductions; however, they are hard to track with the constant enhancements of the instance families. Older instance types are hardly ever retired, but their prices are undercut by the newer generations. As expected, this is where AWS shows its scale and purchasing power, but it creates unnatural plateaus and drops within a long-term view of the service’s pricing. So that leaves the investigation to the “youngest” of the three elders’ children, S3.</p>
<h3 id="testing-methodology">Testing Methodology</h3>
<p>In selecting S3, we get several benefits to our analysis:</p>
<ol>
<li>a singular class of storage since inception providing clear pricing since 2006</li>
<li>a view of the AWS purchasing power compared against trackable consumables such as drive capacity</li>
<li>a trend which demonstrates how AWS treats smaller consumers versus the largest consumers of a service over time</li>
</ol>
<p>Looking at all the price reductions, I normalized the data from 2006 through the last reduction in March 2014. To do this I had to take the current S3 tiers and apply them retroactively to the pricing from the past, which yielded comparable prices regardless of the tiers in use at the time. This was surprisingly time consuming, as some tiers were written as absolute ranges (“0-50TB”) while others were additive in nature (“next 100 TB”).</p>
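<p>To make that normalization concrete, here is a minimal sketch that converts additive tier definitions into absolute ranges; the widths and prices are illustrative, not actual S3 price points:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python">def to_absolute(additive_tiers):
    """Convert additive tiers [(width_tb, price_per_gb), ...]
    into absolute ranges [((start_tb, end_tb), price_per_gb), ...]."""
    absolute, start = [], 0
    for width, price in additive_tiers:
        absolute.append(((start, start + width), price))
        start += width
    return absolute

# "first 50 TB, next 50 TB, next 400 TB" becomes 0-50, 50-100, 100-500
print(to_absolute([(50, 0.150), (50, 0.140), (400, 0.130)]))</code></pre></figure>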
<h3 id="historical-pricing">Historical Pricing</h3>
<p>In the graph below, you can see this normalized data. The tiers are arranged smallest to largest, and within each grouping the bars run oldest to newest.</p>
<p class="rightforce"><a href="/assets/images/s3-costs-overtime.png" target="_blank"><img src="/assets/images/s3-costs-overtime.png" alt="S3 Costs Overtime" /></a>
<em><sup>Click to expand</sup></em></p>
<p>What I see in that data is that AWS …</p>
<ul>
<li>has never increased prices, but has not always made it cheaper</li>
<li>decreases costs in a specific set of tiers (e.g. mid volume or high volume) most of the time</li>
<li>drove down their internal costs in 2008, 2012, and 2014 to provide the highest discounts</li>
</ul>
<h3 id="purchasing-power">Purchasing Power</h3>
<p>What surprises me in those charts is actually how steady prices are at the higher volume tiers. As a result, I went looking for a historical price record and found one at Backblaze. I want to thank <a href="https://www.backblaze.com/">Backblaze</a> for letting me use this image. I find this a particularly interesting graph because these prices reflect volume-scale purchasing; they are more representative of what AWS would pay than what I would pay at my local retailer.</p>
<p class="rightforce"><a href="/assets/images/backblaze-chart-cost-per-drive-2017.png" target="_blank"><img src="/assets/images/backblaze-chart-cost-per-drive-2017.png" alt="Costs for Hard Drives" /></a>
<em><sup>Click to expand</sup></em></p>
<p>The chart depicted does not start in 2008, but I would venture that the 2008 savings were more of an architectural update, because drive prices do not fall drastically as usage increases. When we look at 2012 and 2014, things get more interesting. In 2012, 4TB drives premiered around $0.08/GB, which is what 2TB drives cost in 2010; even more telling was the drastic price decrease in 3TB drives. As a result of these supply-chain price decreases, we see a nearly 40% drop in the price of storage over 50TB compared to the 2010 pricing.</p>
<h3 id="volume-customers">Volume Customers</h3>
<p>To see how AWS treats volume customers, I think it’s important to look at the tiers they offer. In the table below, we can see where AWS has decided to reward or penalize customers for storing too little over time, or too much too early on.</p>
<table class="mbtablestyle">
<thead>
<tr>
<th style="text-align: center">Tier</th>
<th style="text-align: center">2006</th>
<th style="text-align: center">2008</th>
<th style="text-align: center">2009</th>
<th style="text-align: center">2010</th>
<th style="text-align: center">2012</th>
<th style="text-align: center">2014</th>
<th style="text-align: center">2016</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center">None</td>
<td style="text-align: center">X</td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
</tr>
<tr>
<td style="text-align: center">0-1 TB</td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"><strong>N</strong></td>
<td style="text-align: center">X</td>
<td style="text-align: center">X</td>
<td style="text-align: center">X</td>
</tr>
<tr>
<td style="text-align: center">0-50 TB</td>
<td style="text-align: center"> </td>
<td style="text-align: center"><strong>N</strong></td>
<td style="text-align: center">X</td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
</tr>
<tr>
<td style="text-align: center">1 - 50 TB</td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"><strong>N</strong></td>
<td style="text-align: center">X</td>
<td style="text-align: center">X</td>
<td style="text-align: center">X</td>
</tr>
<tr>
<td style="text-align: center">50-100 TB</td>
<td style="text-align: center"> </td>
<td style="text-align: center"><strong>N</strong></td>
<td style="text-align: center">X</td>
<td style="text-align: center">X</td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
</tr>
<tr>
<td style="text-align: center">100-500 TB</td>
<td style="text-align: center"> </td>
<td style="text-align: center"><strong>N</strong></td>
<td style="text-align: center">X</td>
<td style="text-align: center">X</td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
</tr>
<tr>
<td style="text-align: center">50 - 500 TB</td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center">X</td>
<td style="text-align: center">X</td>
<td style="text-align: center">X</td>
</tr>
<tr>
<td style="text-align: center">500 - 1000 TB</td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"><strong>N</strong></td>
<td style="text-align: center">X</td>
<td style="text-align: center">X</td>
<td style="text-align: center">X</td>
<td style="text-align: center">X</td>
</tr>
<tr>
<td style="text-align: center">1000-5000 TB</td>
<td style="text-align: center"> </td>
<td style="text-align: center"> </td>
<td style="text-align: center"><strong>N</strong></td>
<td style="text-align: center">X</td>
<td style="text-align: center">X</td>
<td style="text-align: center">X</td>
<td style="text-align: center">X</td>
</tr>
</tbody>
</table>
<p>Legend: <strong>N</strong> = Newly Added, X = Pre-existing & carried forward</p>
<p>What you can see in this data is that AWS clearly rewards, or more correctly incentivizes, Data Gravity. In 2009 they cut the prices for the top three buckets while leaving the rest the same. In the next two reductions they dropped their prices for the middle, and again in 2014 and 2016 they drastically cut prices on the higher end. Additionally, over time, they increased the “width” of the middle buckets as well. All of this indicates that AWS wants to host all of your data, which it uses as a lever to move users into services beyond pure storage, surprising no one.</p>
<h3 id="conclusion">Conclusion</h3>
<p>I hope this sheds some light on how AWS, or any cloud provider for that matter, is reevaluating their internal costs and adjusting to provide the best value to their users. If you would like the raw data, please do not hesitate to reach out to me.</p>
<p><a href="/articles/investigating-AWS-pricing-over-time/">Investigating AWS Pricing over Time</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on April 17, 2018.</p>
<h2 id="multi-cloud-a-myth-or-a-practical-reality">Multi-Cloud: A myth or a practical reality?</h2>
<p>For many businesses, being on one, two, or even three Cloud vendors is attractive as a way to mitigate financial risk from untenable price increases and business risk around the availability of critical systems. Other businesses want to provide global reach to their customers, where some providers excel in certain regions while others do not perform within acceptable parameters.</p>
<h3 id="what-is-a-cloud">What is a Cloud?</h3>
<p>The more pressing question, and frankly the harder one to answer, is “What is a Cloud?”. The Cloud can be many things to many people, starting with the simple systems administrator’s version: “someone else’s computers in someone else’s data center”. If you ask a business analyst, they may tell you that Salesforce is a Cloud, or possibly Google’s gSuite. You would be remiss to argue that either of those points of view is invalid or even misguided.</p>
<p>So, for brevity, I will be talking about IaaS vendors going forward. Some of this will also apply to PaaS vendors, but we will not be focusing on SaaS vendors at all.</p>
<h3 id="examining-the-risks">Examining the Risks</h3>
<p>The problem with most enterprises is that they have largely decided to build systems in the IaaS Clouds much as they do on-premises. In some vendor offerings, such as vCloud Air or VMware on AWS, if you have the overlay networks and virtual SANs in place on-premises today, that largely works. The problems start when you move to offerings which are not like-for-like with what you did in the past.</p>
<h4 id="financial-risk">Financial Risk</h4>
<p>This risk is by far the least understood by most technologists but one of the most important to the business. As part of the Cloud transition, costs largely move from Capital Expenditures (CapEx) to Operational Expenditures (OpEx). This is important to understand because CapEx can be seen as a transfer of assets from cash to equipment under the Generally Accepted Accounting Principles (GAAP). As such, the company is not expending money; it is transferring money into non-monetary assets which lose value over time, a process known as depreciation. For example, a million-dollar mainframe depreciated over five years hits the books at roughly $200,000 a year, so on day one it does not show up as a million dollars of cash gone out the door.</p>
<p>This all changes in the Cloud, with some exceptions such as reservations. When companies spend money in the Cloud, it is generally treated as OpEx under GAAP, which means a dollar spent is a dollar gone from the company’s books.</p>
<h2 id="what-does-it-mean-to-be-multi-cloud">What does it mean to be Multi-Cloud</h2>
<p>With the risks well understood, the question is now what does it actually mean to be Multi-Cloud. There are several camps of thought around this topic and I would summarize them as:</p>
<ol>
<li>Syncing your static backups to another provider</li>
<li>Using vendor agnostic provisioning systems (e.g. Terraform, Puppet, Ansible, etc.) and having a copy of warm data in another provider</li>
<li>Actively running your app(s) across multiple Cloud vendors at once</li>
</ol>
<p>Clearly these are cumulative; you cannot do #3 without #1 and #2. They are also presented in increasing order of difficulty and complexity.</p>
<p>I would argue that about 70% of companies should just do #1 and keep a runbook on how to use the second provider by hand if required, mitigating the financial risk of rising single-provider costs. For most of that 70%, the correct way to reduce business risk and ensure application durability is to use multiple regions within the primary provider.</p>
<p>In part 2 of this series, I will investigate what you need, when you should do it, what to avoid, and what the first step is.</p>
<p><a href="/articles/so-lets-talk-about-multi-cloud.1/">So Let's Talk About Multi Cloud</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on January 08, 2018.</p>
<h2 id="pushing-the-cloudformation-bleeding-edge-native-modular-templates">Pushing the CloudFormation Bleeding Edge: Native Modular Templates</h2>
<p>When the YAML format for CloudFormation launched in September 2016, many users knew it was only a matter of time until the commonly used pattern of including multiple YAML files in a single file made its way into CloudFormation. On March 28, 2017, AWS did exactly that by launching the AWS::Include Transform, albeit with a surprising lack of fanfare.</p>
<p>While YAML was not a prerequisite for this feature, it made it infinitely easier to leverage as an end-user. There are several important things I have discovered as I integrated AWS::Include into my daily work; some of these are documented fully, others partially, and others not at all.</p>
<h3 id="terms-of-use">Terms of Use:</h3>
<ul>
<li>Partials - The snippets of CloudFormation stored in S3</li>
<li>Master - The template executed by the end-user</li>
<li>Includes - The AWS::Include Transform</li>
</ul>
<h3 id="key-points">Key Points</h3>
<ul>
<li>Your Partials may be in either JSON or YAML</li>
<li>Partials must use only the long form of a function call (e.g. Fn::Sub, not !Sub)</li>
<li>Change sets are required for use of Includes</li>
<li>Partials must be accessible by the end-user’s STS assumption for CloudFormation
<ul>
<li>Public ACL Read is not required if you have a good bucket policy</li>
</ul>
</li>
<li>Partials are included into the Master <strong>before</strong> evaluation of functions
<ul>
<li>Prevents using Fn::Sub in a Location directive for dev/prod s3 path of the Partials</li>
</ul>
</li>
<li>Errors which occur in Partials create unusual errors on evaluation</li>
<li>Understanding scope is very important</li>
<li>Nested Includes calls are not supported (i.e. your Partial cannot include another Partial)</li>
</ul>
<p>Now I will dive into some of these key points in detail, covering use cases for AWS::Include and lessons learned from living on the bleeding edge.</p>
<p>In a future post, I will detail use cases where Includes is ideal for your business, such as creating predictable IAM Roles or a multi-engine RDS template.</p>
<h3 id="key-points-in-details">Key Points in Details</h3>
<h5 id="execution-model">Execution Model</h5>
<p>The execution model when using Includes is through CloudFormation Change Sets, which is a great way to enforce a known checkpoint but brings difficulties for people who don’t use CloudFormation daily. When you use an Include and you want to make a new stack, you are left with two options:</p>
<ol>
<li>Create the stack within the AWS Console - the console automatically creates a blank stack, change set, and prompts for CAPABILITY_NAMED_IAM</li>
<li>Create a “blank” stack (e.g. just a wait handle, as sketched below) and then create a change set against that stack with CAPABILITY_NAMED_IAM in the create-change-set call</li>
</ol>
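<p>For reference, here is what such a minimal “blank” stack might look like; it contains nothing but a wait condition handle, so it creates instantly and gives you a target for change sets (the logical name Blank is just an illustration):</p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml">---
AWSTemplateFormatVersion: 2010-09-09
Description: Placeholder stack used only as a target for change sets
Resources:
  Blank:
    Type: AWS::CloudFormation::WaitConditionHandle</code></pre></figure>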
<h5 id="mixing-json-and-yaml">Mixing JSON and YAML</h5>
<p>This adventure into Includes was my first significant effort at using YAML for all my CloudFormation; up until this point, the benefits of YAML (specifically inline comments) were not worth the time it would take to rewrite what we had been using to date.</p>
<p>One of the use cases I leveraged Includes for is deploying IAM Policies and Roles for Federated Login, as it requires predictable Role naming. I have found in practice that it is easier to write the policies in JSON, but since I was using YAML now, I decided to keep everything in pure YAML for readability.</p>
<p>AWS has, thankfully, provided the ability to continue keeping your Partials in either language regardless of the Master’s language. You may choose to keep your JSON templates for things like IAM Policies and use YAML for the Master.</p>
<p>What I did leverage from time to time was cfn-flip to ensure my YAML syntax was in line with the JSON evaluation. As the Partials are included they are converted to JSON, so this is a reasonable checkpoint for yourself.</p>
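<p>cfn-flip also ships as a Python package, so the checkpoint can be scripted; a minimal sketch, assuming the cfn_flip package is installed and my_resources.yaml is one of the Partials from later in this post:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"># Round-trip a Partial through cfn-flip's YAML-to-JSON converter;
# a YAML mistake surfaces here instead of inside a failed change set.
from cfn_flip import to_json

with open("my_resources.yaml") as f:
    print(to_json(f.read()))</code></pre></figure>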
<h5 id="evaluation-logic-and-order">Evaluation Logic and Order</h5>
<p>As I said above, you cannot have any other function evaluated before the Include happens. This means that you cannot do something like this:</p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml">Parameters:
PartialsEnv:
Type: String
Default: prod
AllowedValues:
- prod
- dev
...
Resources:
MyTestResource:
Fn::Transform:
Name: AWS::Include
Parameters:
Location:
Fn::Sub: s3://my-partials-bucket/${PartialsEnv}/resources/test_resource.yaml</code></pre></figure>
<p>This will throw an error that you must provide a valid S3 URI/Object. I have raised this with support, and an RFE has been created to allow this, or something like it, to be accepted.</p>
<p>While the order of operations is not specific to Includes, I was bitten by the unwritten order more than once. For your reference, I have compiled an incomplete list of the evaluation precedence steps here:</p>
<ol>
<li>Mappings</li>
<li>Reference Lookups</li>
<li>Conditional Statements</li>
<li>Substitutions</li>
</ol>
<p>I will try to obtain the information on a complete list of these in the future and make a separate post on that.</p>
<p>The reason I included these is that you are likely to end up trying to use Mappings inside Conditionals via Reference Lookups, and that will fail with unpredictable results.</p>
<h5 id="errors--debugging">Errors & Debugging</h5>
<p>I have compiled a list of the most common issues I have run into with Includes and YAML in general:</p>
<table>
<thead>
<tr>
<th style="text-align: center">Common Errors</th>
<th style="text-align: center">Causes</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center">Circular Dependancies</td>
<td style="text-align: center">Incorrect/Invalid Reference in the Partial</td>
</tr>
<tr>
<td style="text-align: center">Invalid Policy Syntax for IAM on execution</td>
<td style="text-align: center">A <code>*</code> IAM Policy Resource was not <code>"*"</code></td>
</tr>
<tr>
<td style="text-align: center">Invalid MajorEngineVersion ##.# for SqlServer Option Group</td>
<td style="text-align: center">SqlServer Option Groups are ##.## and during the YAML to JSON conversion, it drops the superfluous 4th digit, quoting such as <code>"13.00"</code> fixes this</td>
</tr>
<tr>
<td style="text-align: center">Must Provide valid S3 URI or S3 Object</td>
<td style="text-align: center">1. You are referencing a private S3 Object with no Bucket Policy</td>
</tr>
<tr>
<td style="text-align: center"> </td>
<td style="text-align: center">2. You are trying to do a Function within the S3 Location call</td>
</tr>
<tr>
<td style="text-align: center"> </td>
<td style="text-align: center">3. You are trying to reference a S3 Object which has no valid CloudFormation items</td>
</tr>
</tbody>
</table>
<h2 id="scope-and-use-in-practice">Scope and Use In Practice</h2>
<p>When using Includes, it’s very important to pay attention to how scope is in use. You cannot have two Includes calls at a single scope level. I have detailed interesting use cases around scope below, and it’s important that all of these items are <strong>mutually exclusive</strong> within a single template (#1) or within sections (#2/#3).</p>
<h5 id="1---use-the-transform-section-of-the-template">1 - Use the transform section of the template</h5>
<p>In this example I show how you can replace an entire template with an Include, except Parameters, which cannot be part of an Include.</p>
<p><em>Master Template</em></p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml">---
AWSTemplateFormatVersion: 2010-09-09
Transform:
Name: AWS::Include
Parameters:
Location: s3://my-partials-bucket/my_stack.yaml
# Parameters cannot be in an Includes
Parameters:
MyParam:
Type: String</code></pre></figure>
<p><em>Partials Template</em></p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml">Metadata:
... # Your Metadata
Conditionals:
... # Your Conditionals
Mappings:
... # Your Mappings
Resources:
... # Your Resources
Outputs:
... # Your Outputs</code></pre></figure>
<h5 id="2---a-section-level">2 - A section level</h5>
<p>In this example I show how you can take all of the Resources, Outputs, etc. of a template and put them into Partials.</p>
<p><em>Master Template</em></p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml">Mappings:
Fn::Transform:
Name: AWS::Include
Parameters:
Location: s3://my-partials-bucket/mappings/my_mappings.yaml
# Parameters cannot be in an Includes
Parameters:
MyParam:
Type: String
...
Resources:
Fn::Transform:
Name: AWS::Include
Parameters:
Location: s3://my-partials-bucket/resources/my_resources.yaml
Outputs:
Fn::Transform:
Name: AWS::Include
Parameters:
Location: s3://my-partials-bucket/outputs/my_outputs.yaml</code></pre></figure>
<p><em>Partials Templates</em></p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"># my_mappings.yaml
AWS::CloudFormation::Interface:
ParameterGroups:
-
Label:
default: Global Account Information
Parameters:
- MyParam
ParameterLabels:
# These values must be quoted to add white space
MyParam:
default: 'My Parameter: '</code></pre></figure>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"># my_resources.yaml
MyLogicalKMSResourceName:
Type: AWS::KMS::Key
Properties:
Description: |
My KMS Example Resource
Enabled: true
...
MyLogicalWaitResourceName:
Type: AWS::CloudFormation::WaitConditionHandle</code></pre></figure>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"># my_outputs.yaml
MyLogicalKMSResourceOutput:
Description: |
KMS ARN Example
Value:
Ref: MyLogicalKMSResourceName
Export:
Name:
Fn::Sub: ${AWS::StackName}-MyLogicalKMSResourceOutput</code></pre></figure>
<h5 id="3---multiple-resources">3 - Multiple Resources</h5>
<p>In this example I show how you can use Includes to abstract the body of each resource and output into its own Partial.</p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml">...
Parameters:
MyParam:
Type: String
...
Resources:
MyLogicalKMSResourceName:
Fn::Transform:
Name: AWS::Include
Parameters:
Location: s3://my-partials-bucket/resources/kms_key.yaml
MyLogicalWaitResourceName:
Fn::Transform:
Name: AWS::Include
Parameters:
Location: s3://my-partials-bucket/resources/wait_handle.yaml
...
Outputs:
MyLogicalKMSResourceOutput:
Fn::Transform:
Name: AWS::Include
Parameters:
Location: s3://my-partials-bucket/outputs/kms_key.yaml
...</code></pre></figure>
<p><em>Partials Templates</em></p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml">#resources/kms_key.yaml
Type: AWS::KMS::Key
Properties:
Description: |
My KMS Example Resource
Enabled: true
...</code></pre></figure>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml">#resources/kms_key.yaml
Type: AWS::CloudFormation::WaitConditionHandle</code></pre></figure>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml">#outputs/kms_key.yaml
Description: |
KMS ARN Example
Value:
Ref: MyLogicalKMSResourceName
Export:
Name:
Fn::Sub: ${AWS::StackName}-MyLogicalKMSResourceOutput</code></pre></figure>
<h5 id="4---within-a-resource">4 - Within a resource</h5>
<p>This example shows how to use Includes to give the end-user some modularity while keeping common attributes in the Partial. With this method, as the warning below notes, scope is very tricky and you should take care. In the example I create an RDSDBParameterGroup which allows the Master Template to specify which Parameters are in use for this RDS instance. Additionally, I show a second method of advanced scoping which allows you to place an Include at the end of a section to provide any “generic” items; because of scoping, any use of this must be the last item in the section or it will be overridden. I also demonstrate that regardless of how a resource is declared (e.g. RDSDBParameterGroup is “within a resource”), the Logical ID persists in the template after compilation (e.g. the Output for RDSDBParameterGroup is in the generic Outputs Include).</p>
<p><strong>Note: This method is considered advanced and requires significant testing</strong></p>
<p><em>Master Template</em></p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml">Mappings:
...
# Parameters cannot be in an Includes
Parameters:
MyParam:
Type: String
...
Resources:
RDSDBParameterGroup:
Type: AWS::RDS::DBParameterGroup
Properties:
# Common items for all Parameter Groups
Fn::Transform:
Name: AWS::Include
Parameters:
Location: s3://my-partials-bucket/resources/rds_parameter_group.yaml
# Custom Parameters per RDS which are set by the "stack owner"
Parameters:
sql_mode: IGNORE_SPACE
timezone: UTC
# Must be the last item in the section
# Includes a series of generic resources
Fn::Transform:
Name: AWS::Include
Parameters:
Location: s3://my-partials-bucket/resources/general_resouces.yaml
Outputs:
MyLogicalKMSResourceOutput:
Fn::Transform:
Name: AWS::Include
Parameters:
Location: s3://my-partials-bucket/outputs/kms_key.yaml
# Must be the last item in the section
# Includes a series of generic outputs
Fn::Transform:
Name: AWS::Include
Parameters:
Location: s3://my-partials-bucket/outputs/general_outputs.yaml</code></pre></figure>
<p><em>Partials Templates</em></p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"># resources/rds_parameter_group.yaml
Description: RDS DB parameter group
Family: rdsengine-12.3
Tags:
-
Key: Name
Value: rds-engine-12.3-parameter-group</code></pre></figure>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"># resources/general_resouces.yaml
MyLogicalKMSResourceName:
Type: AWS::KMS::Key
Properties:
Description: |
My KMS Example Resource
Enabled: true
...
MyLogicalWaitResourceName:
Type: AWS::CloudFormation::WaitConditionHandle</code></pre></figure>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"># outputs/kms_key.yaml
Description: |
KMS ARN Example
Value:
Ref: MyLogicalKMSResourceName
Export:
Name:
Fn::Sub: ${AWS::StackName}-MyLogicalKMSResourceOutput</code></pre></figure>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"># outputs/general_outputs.yaml
RDSDBParameterGroupOutput:
Description: |
A RDS Parameter Group Example
Value:
Ref: RDSDBParameterGroup
Export:
Name:
Fn::Sub: ${AWS::StackName}-RDSDBParameterGroup</code></pre></figure>
<h2 id="lessons-learned">Lessons Learned</h2>
<ul>
<li>Numbers are not maintained with full precision on conversion (13.00 becomes 13.0)</li>
<li>When debugging, make a “fat” template if you run into issues. Many “good errors” are hidden by Change Sets and Includes, so if you hit weird issues, make a template with everything in it first and then break it out once it’s working.</li>
<li>Make a test blank stack (see above) when developing, as a failed change set rolls back to the blank stack, preventing continual stack create/delete cycles</li>
<li>Validate all your YAML first, and not just through a CloudFormation-aware linter; the stricter the YAML linter, the better</li>
<li>When in doubt, use cfn-flip and see if it still works</li>
<li>Develop IAM policies in IAM first, then use a JSON to YAML converter to embed into your templates</li>
<li>New features have new bugs; you may call support and want to hit your head on your desk when it’s something “simple”, but those occurrences are outweighed by the times it’s a true bug</li>
</ul>
<h3 id="references">References</h3>
<p><em>Announcement: https://aws.amazon.com/about-aws/whats-new/2017/03/aws-cloudformation-supports-authoring-templates-with-code-references-and-amazon-vpc-peering/</em></p>
<p><em>AWS Documentation: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/create-reusable-transform-function-snippets-and-add-to-your-template-with-aws-include-transform.html</em></p>
<p><a href="/articles/making-modular-cloudformation-with-includes/">Making Modular CloudFormation with Includes</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on May 08, 2017.</p>
<h2 id="athena-and-cloudtrail-a-marriage-made-in-the-cloud">Athena and CloudTrail: A Marriage made in the Cloud</h2>
<p>One of the first things which came to mind when AWS announced Athena at re:Invent 2016 was querying CloudTrail logs. Over the course of the past month I had intended to set this up, but current needs dictated I do it quickly. When I went looking at JSON imports for Hive/Presto, I was quite confused. Of course, as a trusty technologist, I went to Google. Much to my surprise, no one had published an article about using Athena to do this; I was only able to locate EMR-based posts which used a custom SerDe to support the nested CloudTrail format.</p>
<p>I had mild success at first, but thanks to some Athena gurus, I was able to get the magic piece in place.</p>
<p>I have to provide credit to AWS for their help with a few issues and amazing documentation on the event types.</p>
<p>I have provided references at the end of each field section and the end of the post with specific and broader details for the event fields and their uses.</p>
<p>To set all of this up, you first must have your CloudTrail logs in a single S3 bucket. This will work with a single account or many; I purposely set up delivery to a single bucket and created a table per source in Athena under a common database.</p>
<p>This is an example create table statement which shows the table/field syntax formats I used in the tables below.</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">CREATE EXTERNAL TABLE my_table_name (
Records ARRAY< STRUCT< eventName: STRING,
requestParameters: STRUCT< instancesSet: STRUCT< items: ARRAY< STRUCT< instanceId: STRING >>>,
volumeSet: STRUCT< items: ARRAY< STRUCT< volumeId: STRING > > > >,
eventType: STRING,
eventSource: STRING,
sourceIPAddress: STRING,
userIdentity: STRUCT< arn: STRING,
principalId: STRING,
accountId: STRING,
invokedBy: STRING,
TYPE: STRING,
sessionContext: STRUCT< sessionIssuer: STRUCT< arn: STRING,
principalId: STRING,
accountId: STRING,
TYPE: STRING,
userName: STRING >,
attributes: STRUCT< creationDate: STRING,
mfaAuthenticated: STRING > > >,
eventVersion: STRING,
responseElements: STRUCT< credentials: STRUCT< accessKeyId: STRING,
expiration: STRING,
sessionToken: STRING >,
assumedRoleUser: STRUCT< arn: STRING,
assumedRoleId: STRING > >,
userAgent: STRING,
eventID: STRING,
awsRegion: STRING,
sharedEventID: STRING,
eventTime: STRING,
resources: ARRAY< STRUCT< accountId: STRING,
TYPE: STRING,
ARN: STRING > >,
requestID: STRING,
recipientAccountId: STRING >>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH serdeproperties( 'ignore.malformed.json' = 'true' )
LOCATION 's3://my_consolidated_bucket/my-cross-account-prefix/AWSLogs/'</code></pre></figure>
<h2 id="cloudtrail-record-query-columns">CloudTrail Record Query Columns</h2>
<p>These are the columns you can reference in your queries, grouped by purpose. This is not a full list of all CloudTrail fields, so if you need others such as vpcEndpointId, you should add them to the schema.</p>
<p><strong>Event ID Fields</strong></p>
<table>
<tr>
<td>record.eventID</td>
<td>GUID generated by CloudTrail to uniquely identify each event</td>
</tr>
<tr>
<td>record.sharedEventID</td>
<td>GUID generated by CloudTrail to uniquely identify CloudTrail events from the same AWS action that is sent to different AWS accounts</td>
</tr>
</table>
<p><strong>Event Details</strong></p>
<table>
<tr>
<td>record.eventName</td>
<td>The requested action, which is one of the actions in the API for that service. (example: DescribeLoadBalancers)</td>
</tr>
<tr>
<td>record.eventSource</td>
<td>The service that the request was made to (e.g. ec2.amazonaws.com)</td>
</tr>
<tr>
<td>record.eventTime</td>
<td>The date and time the request was made, in coordinated universal time (UTC)</td>
</tr>
<tr>
<td>record.eventType</td>
<td>Identifies the type of event that generated the event record: one of AwsApiCall, AwsConsoleSignIn, or AwsServiceEvent (an event generated by AWS itself; this can occur when another account makes a call with a resource that you own)</td>
</tr>
<tr>
<td> record.eventVersion</td>
<td>The version of the log event format</td>
</tr>
<tr>
<td> record.sourceIPAddress</td>
<td>The IP address that the request was made from, when console is used, it will report console.amazonaws.com</td>
</tr>
</table>
<p><strong>Request Details</strong></p>
<table>
<tr>
<td>record.requestId</td>
<td>The value that identifies the request, generated by the service being called</td>
</tr>
<tr>
<td>record.requestParameters</td>
<td>The parameters, if any, that were sent with the request</td>
</tr>
</table>
<p><strong>Resource Details</strong></p>
<table>
<tr>
<td>record.resources</td>
<td>An array of the resources accessed in the event, used most often by STS or KMS
</td>
</tr>
<tr>
<td>record.resources.accountId</td>
<td> The account ID of the impacted element</td>
</tr>
</table>
<p><strong>Response Details</strong></p>
<table>
<tr>
<td>record.responseElements.assumedRoleUser.arn</td>
<td>The arn of the assumed role for the unique session</td>
</tr>
<tr>
<td>record.responseElements.assumedRoleUser.assumedRoleId</td>
<td>The ID of the assumed role for the unique session</td>
</tr>
<tr>
<td>record.responseElements.credentials.accessKeyId</td>
<td>The access key of the caller
</td>
</tr>
<tr>
<td>record.responseElements.credentials.expiration</td>
<td>The expiration of the current session</td>
</tr>
<tr>
<td>record.responseElements.credentials.sessionToken</td>
<td>The active token for the session</td>
</tr>
</table>
<p><em>References</em> <br />
<em>http://docs.aws.amazon.com/IAM/latest/UserGuide/cloudtrail-integration.html#stscloudtrailexample</em>
<em>http://docs.aws.amazon.com/kms/latest/developerguide/logging-using-cloudtrail.html</em></p>
<p><strong>Miscellaneous</strong></p>
<table>
<tr>
<td>record.userAgent</td>
<td>The agent through which the request was made
</td>
</tr>
<tr>
<td>record.recipientAccountId</td>
<td>Represents the account ID that received this event, may differ from the calling account if cross-account access occurred and will differ on the "remote" end</td>
</tr>
</table>
<p><strong>User Identity</strong></p>
<table>
<tr>
<td>record.userIdentity.accountId</td>
<td>The account that owns the entity that granted permissions for the request</td>
</tr>
<tr>
<td>record.userIdentity.arn</td>
<td>The Amazon Resource Name (ARN) of the principal that made the call</td>
</tr>
<tr>
<td>record.userIdentity.invokedBy</td>
<td>The name of the AWS service that made the request, if the request was made by a service</td>
</td>
</tr>
<tr>
<td>record.userIdentity.principalId</td>
<td>A unique identifier for the entity that made the call. For requests made with temporary security credentials, this value includes the session name that is passed to the AssumeRole, AssumeRoleWithWebIdentity, or GetFederationToken API call</td>
</tr>
<tr>
<td>record.userIdentity.sessionContext.attributes.creationDate</td>
<td>The date and time when the temporary security credentials were issued</td>
</tr>
<tr>
<td>record.userIdentity.sessionContext.attributes.mfaAuthenticated</td>
<td>The value is true if the root user or IAM user whose credentials were used for the request also was authenticated with an MFA device; otherwise, false</td>
</tr>
<tr>
<td> record.userIdentity.sessionContext.sessionIssuer.accountId</td>
<td>The account that owns the entity that was used to get credentials
</td>
</tr>
<tr>
<td>record.userIdentity.sessionContext.sessionIssuer.arn</td>
<td>The ARN of the entity that was used to get credentials</td>
</td>
</tr>
<tr>
<td>record.userIdentity.sessionContext.sessionIssuer.type</td>
<td>The source of the temporary security credentials, such as Root, IAMUser, or Role</td>
</tr>
<tr>
<td>record.userIdentity.sessionContext.sessionIssuer.userName</td>
<td>The friendly name of the user or role that issued the session. The value that appears depends on the sessionIssuer identity type. See reference material for more information</td>
</tr>
<tr>
<td>record.userIdentity.type</td>
<td>The type of the identity, which is one of: Root, IAMUser, AssumedRole, FederatedUser, AWSAccount (cross-account access), or AWSService (access performed by an AWS service such as Elastic Beanstalk)</td>
</tr>
</table>
<p><em>Reference: http://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-event-reference-user-identity.html#cloudtrail-event-reference-user-identity-fields</em></p>
<p><strong>Example Queries</strong></p>
<p>Find the most frequent event name/ARN/source IP tuples and count them, highest totals first</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">SELECT record.eventName, record.userIdentity.arn, record.sourceIPAddress, COUNT(*)
FROM
(SELECT record
FROM my_table_name
CROSS JOIN UNNEST(records) AS t (record)) AS records
GROUP BY record.eventName, record.userIdentity.arn, record.sourceIPAddress
ORDER BY COUNT(*) DESC
LIMIT 20;</code></pre></figure>
<p>Find all events where cross-account access occurred, group them by the source and the ARN and count the totals</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">SELECT record.eventName, record.eventSource, record.userIdentity.arn, COUNT(*)
FROM
(SELECT record
FROM my_table_name
CROSS JOIN UNNEST(records) AS t (record)) AS records
WHERE record.recipientAccountId <> record.userIdentity.accountId
GROUP BY record.eventName, record.eventSource, record.userIdentity.arn
ORDER BY COUNT(*) DESC
LIMIT 20;</code></pre></figure>
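<p>As one more sketch against this schema, here is a query that surfaces temporary-credential activity where MFA was not used:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">SELECT record.userIdentity.arn, COUNT(*)
FROM
(SELECT record
FROM my_table_name
CROSS JOIN UNNEST(records) AS t (record)) AS records
WHERE record.userIdentity.sessionContext.attributes.mfaAuthenticated = 'false'
GROUP BY record.userIdentity.arn
ORDER BY COUNT(*) DESC
LIMIT 20;</code></pre></figure>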
<p><em>Document Reference: http://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-event-reference.html</em></p>
<p><a href="/articles/using-aws-athena-to-query-cloudtrail-logs/">Using AWS Athena to Query CloudTrail Logs</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on January 26, 2017.</p>
<p>In /var/log/ntpstats/peerstats you see lines like (I put them in table form for readability - normally space delimited)</p>
<table class="table table-striped">
<thead>
<tr>
<th>Day</th>
<th>Seconds</th>
<th>Peer IP</th>
<th>Peer Status Word</th>
<th>Offset</th>
<th>Delay</th>
<th>Dispersion</th>
<th>Skew (variance)</th>
</tr>
</thead>
<tbody>
<tr>
<td>56791</td>
<td>36043.625</td>
<td>10.39.32.12</td>
<td>8023</td>
<td>-0.000106166</td>
<td>0.000316335</td>
<td>7.946282622</td>
<td>0.000000119</td>
</tr>
<tr>
<td>56791</td>
<td>36824.626</td>
<td>10.39.32.11</td>
<td>9034</td>
<td>0.000068454</td>
<td>0.000453367</td>
<td>7.937500123</td>
<td>0.000000119</td>
</tr>
<tr>
<td>56791</td>
<td>36839.626</td>
<td>10.39.32.12</td>
<td>9034</td>
<td>-0.000027949</td>
<td>0.000240638</td>
<td>7.937500121</td>
<td>0.000000119</td>
</tr>
<tr>
<td>56791</td>
<td>37082.626</td>
<td>10.39.32.12</td>
<td>9034</td>
<td>-0.000047201</td>
<td>0.000307433</td>
<td>3.938467683</td>
<td>0.000115655</td>
</tr>
<tr>
<td>56791</td>
<td>37108.626</td>
<td>10.39.32.11</td>
<td>8023</td>
<td>-0.000128532</td>
<td>0.000392425</td>
<td>7.937500122</td>
<td>0.000000119</td>
</tr>
<tr>
<td>56791</td>
<td>37110.626</td>
<td>10.39.32.12</td>
<td>9034</td>
<td>-0.000071405</td>
<td>0.000344577</td>
<td>3.937507683</td>
<td>0.000057128</td>
</tr>
<tr>
<td>56791</td>
<td>37112.626</td>
<td>10.39.32.11</td>
<td>8023</td>
<td>-0.000142907</td>
<td>0.000267320</td>
<td>1.937515213</td>
<td>0.000051571</td>
</tr>
<tr>
<td>56791</td>
<td>38177.626</td>
<td>10.39.32.12</td>
<td>964a</td>
<td>-0.000114107</td>
<td>0.000233899</td>
<td>0.007741648</td>
<td>0.000038685</td>
</tr>
</tbody>
</table>
<h3>Decoding Peer Status Word</h3>
<p>The Peer Status Word is a packed hex value which encodes several things for NTP. Normally <code>ntpq -c "associations"</code> shows you this in English, but to debug you need to look at the peerstats file, which does not decode it for you.</p>
<h4>Decoding the first digit</h4>
<p>Using the table below you can decode the value. For instance, a 9XXX status means 80+10 (configured in ntp.conf and reachable). 8XXX means it is configured (80) but not reachable.</p>
<table class="table table-striped">
<thead>
<tr>
<th>Code</th>
<th>Message</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>08</td>
<td>bcst</td>
<td>broadcast association</td>
</tr>
<tr>
<td>10</td>
<td>reach</td>
<td>host reachable</td>
</tr>
<tr>
<td>20</td>
<td>authenb</td>
<td>authentication enabled</td>
</tr>
<tr>
<td>40</td>
<td>auth</td>
<td>authentication ok</td>
</tr>
<tr>
<td>80</td>
<td>config</td>
<td>persistent association</td>
</tr>
</tbody>
</table>
<h4>Decoding the second digit</h4>
<p>Given 96XX, the host is reachable and configured (as seen above), and the second digit (6) from the table below means it is the system peer (you will also see a * next to it in ntpq -p).</p>
<p>Another example from the sample above is "8<strong>0</strong>23": this means it is configured but NOT reachable, and it was discarded (see the bolded 0, meaning sel_reject).</p>
<table class="table table-striped">
<thead>
<tr>
<th>Code</th>
<th>Message</th>
<th>T</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>sel_reject</td>
<td> </td>
<td>discarded as not valid (TEST10-TEST13)</td>
</tr>
<tr>
<td>1</td>
<td>sel_falsetick</td>
<td>x</td>
<td>discarded by intersection algorithm</td>
</tr>
<tr>
<td>2</td>
<td>sel_excess</td>
<td>.</td>
<td>discarded by table overflow (not used)</td>
</tr>
<tr>
<td>3</td>
<td>sel_outlyer</td>
<td>-</td>
<td>discarded by the cluster algorithm</td>
</tr>
<tr>
<td>4</td>
<td>sel_candidate</td>
<td>+</td>
<td>included by the combine algorithm</td>
</tr>
<tr>
<td>5</td>
<td>sel_backup</td>
<td>#</td>
<td>backup (more than tos maxclock sources)</td>
</tr>
<tr>
<td>6</td>
<td>sel_sys.peer</td>
<td>*</td>
<td>system peer</td>
</tr>
<tr>
<td>7</td>
<td>sel_pps.peer</td>
<td>o</td>
<td>PPS peer (when the prefer peer is valid)</td>
</tr>
</tbody>
</table>
<h4>Decoding the third and fourth digits</h4>
<p>The third digit is the count of occurrences of the event code in the fourth digit, and yes, that ordering seems backwards to normal thought.</p>
<p>The fourth digit, as seen below, is the event code which the third digit counts.</p>
<p>Given the example of "964a" from above, we know it is configured and reachable, it is the system peer, and there have been four occurrences of becoming the system peer (4 times of a).</p>
<p>Another example is "8023" from above: we know it is configured but unreachable, and there have been two occurrences of it being unreachable (2 times of 3). A small decoder covering all four digits follows the table below.</p>
<table class="table table-striped">
<thead>
<tr>
<th>Code</th>
<th>Message</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>01</td>
<td>mobilize</td>
<td>association mobilized</td>
</tr>
<tr>
<td>02</td>
<td>demobilize</td>
<td>association demobilized</td>
</tr>
<tr>
<td>03</td>
<td>unreachable</td>
<td>server unreachable</td>
</tr>
<tr>
<td>04</td>
<td>reachable</td>
<td>server reachable</td>
</tr>
<tr>
<td>05</td>
<td>restart</td>
<td>association restart</td>
</tr>
<tr>
<td>06</td>
<td>no_reply</td>
<td>no server found (ntpdate mode)</td>
</tr>
<tr>
<td>07</td>
<td>rate_exceeded</td>
<td>rate exceeded (kiss code RATE)</td>
</tr>
<tr>
<td>08</td>
<td>access_denied</td>
<td>access denied (kiss code DENY)</td>
</tr>
<tr>
<td>09</td>
<td>leap_armed</td>
<td>leap armed from server LI code</td>
</tr>
<tr>
<td>0a</td>
<td>sys_peer</td>
<td>become system peer</td>
</tr>
<tr>
<td>0b</td>
<td>clock_event</td>
<td>see clock status word</td>
</tr>
<tr>
<td>0c</td>
<td>bad_auth</td>
<td>authentication failure</td>
</tr>
<tr>
<td>0d</td>
<td>popcorn</td>
<td>popcorn spike suppressor</td>
</tr>
<tr>
<td>0e</td>
<td>interleave_mode</td>
<td>entering interleave mode</td>
</tr>
<tr>
<td>0f</td>
<td>interleave_error</td>
<td>interleave error (recovered)</td>
</tr>
</tbody>
</table>
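<p>Putting the three tables together, here is a minimal sketch of a decoder for the status word as it appears in peerstats, following the field layout described above:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"># Decode an NTP peer status word such as "964a" or "8023".
FLAGS = {0x80: "config", 0x40: "auth", 0x20: "authenb",
         0x10: "reach", 0x08: "bcst"}
SELECT = ["sel_reject", "sel_falsetick", "sel_excess", "sel_outlyer",
          "sel_candidate", "sel_backup", "sel_sys.peer", "sel_pps.peer"]
EVENTS = {0x1: "mobilize", 0x2: "demobilize", 0x3: "unreachable",
          0x4: "reachable", 0x5: "restart", 0x6: "no_reply",
          0x7: "rate_exceeded", 0x8: "access_denied", 0x9: "leap_armed",
          0xa: "sys_peer", 0xb: "clock_event", 0xc: "bad_auth",
          0xd: "popcorn", 0xe: "interleave_mode", 0xf: "interleave_error"}

def decode(word):
    w = int(word, 16)
    high = w >> 8                     # flag bits plus the select field
    return {"flags": [n for bit, n in FLAGS.items() if high & bit],
            "select": SELECT[high & 0x07],
            "count": (w >> 4) & 0xf,  # occurrences of the event code
            "event": EVENTS.get(w & 0xf, "none")}

print(decode("964a"))
# {'flags': ['config', 'reach'], 'select': 'sel_sys.peer',
#  'count': 4, 'event': 'sys_peer'}</code></pre></figure>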
<p><a href="/articles/ntp-peerstats-status-word-secret-decoder-ring/">NTP Peerstats Status Word Secret Decoder Ring</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on May 16, 2014.</p>
<p>Long the exclusive property of Apple Remote Desktop Enterprise, the ability to remote into your computer without leaving the screen on for all to see has finally shown up, but it’s in 10.7 only.</p>
<p>Open Terminal and run <code>open vnc://yourcomputername</code></p>
<div>After that is open, click "View" in the menu bar and then click "Switch to Virtual Display". This turns off your iMac’s screen and allows you to work without anyone watching or playing pranks.</div>
<div>In ARD it was called "Curtain Viewing", I'm very happy to see this finally become a much needed item for everyone</div>
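<p>The whole flow, as a quick sketch (the <code>.local</code> hostname is an example; substitute your machine's name):</p>
<pre><code># connect to the remote Mac's screen
open vnc://yourcomputername.local
# then: View menu -> "Switch to Virtual Display"
</code></pre>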
<p><a href="/articles/using-your-macs-screen-remotely-without-people-watching/">Using Your Mac's Screen Remotely Without People Watching</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on April 03, 2012.</p>
/articles/new-site2012-03-08T05:00:00+00:002012-03-08T05:00:00+00:00Thomas Vachoncontactme@thomasvachon.com
<p>Yup, it is about that time again: I got bored and redesigned my site. Hopefully this will get me to write more posts. You can expect to see posts about AWS, mobile, general tech items, or anything else I think is important enough to put into the permanence of cyber-space. Stay tuned...</p>
<p><a href="/articles/new-site/">New Site</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on March 08, 2012.</p>
/articles/adding-vmnets-in-vmware-fusion-42011-09-14T04:00:00+00:002011-09-14T04:00:00+00:00Thomas Vachoncontactme@thomasvachon.com
<p>With the release of VMWare Fusion 4 (and its CONTINUED lack of a GUI for the network manager), I bring you the instructions on how to add networks to VMWare Fusion 4 (now that I can write about it).</p>
<p>In good news, you no longer have to fully restart the network stack via boot.sh; just restarting Fusion will dynamically pick up the changes.</p>
<p>All network configuration files are now found in <code>/Library/Preferences/VMware\ Fusion</code>.</p>
<p>The networking file contains information about the VMNETs and is where you will do most of your configuration.</p>
<p>For example, if you want to create a VMNET4 with no DHCP and host-only networking, you would append the following to the networking file:</p>
<pre><code>answer VNET_4_DHCP no
answer VNET_4_HOSTONLY_NETMASK 255.255.255.0
answer VNET_4_HOSTONLY_SUBNET 172.16.128.0
answer VNET_4_VIRTUAL_ADAPTER yes
</code></pre>
<p>Now you HAVE to edit the .vmx of your VM directly.</p>
<p>Something along the lines of:</p>
<pre><code>ethernet0.connectionType = "custom"
ethernet0.vnet = "vmnet4"
ethernet0.bsdName = "vmnet4"
ethernet0.displayName = "Custom Host Only VMnet4"
</code></pre>
<p>Special things to note: do NOT use the GUI network selector after you do this. Your network will always show as grey; don't worry.</p>
<p>If you want to create a NAT'ed network with no DHCP you would do the same as the above; however, the easiest way to set it up is to copy vmnet8/ to vmnet#/ and remove the dhcpd.conf and dhcpd.conf.bak. Edit the nat.conf to have the appropriate subnet and vmnet (at the bottom you can also create port forwarding if you would like). Also edit the nat.mac, keeping the first three groupings of the MAC address and changing the last three to something unused on your system.</p>
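<p>A rough sketch of those steps (the vmnet number here is just an example):</p>
<pre><code># create a NAT'ed network with no DHCP by cloning vmnet8
cd /Library/Preferences/VMware\ Fusion
sudo cp -R vmnet8 vmnet5
sudo rm vmnet5/dhcpd.conf vmnet5/dhcpd.conf.bak
# then edit vmnet5/nat.conf (subnet, vmnet, optional port forwards)
# and vmnet5/nat.mac (keep the first three MAC groupings, change the last three)
</code></pre>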
<p><a href="/articles/adding-vmnets-in-vmware-fusion-4/">Adding VMNet's in VMWare Fusion 4</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on September 14, 2011.</p>
/articles/security-in-the-cloud2011-09-13T04:00:00+00:002011-09-13T04:00:00+00:00Thomas Vachoncontactme@thomasvachon.com
<p>As many of you know I am a very big proponent of using the cloud with high automation. At my job we do this in a big way. However, one question always comes to mind: if your servers share physical machines with other tenants, how can one guarantee security?</p>
<p>In short you can't, but there are things you can do.</p>
<p>When I say you "can't" guarantee security, I mean that anyone could find a "local" exploit in the hypervisor; just look <a href="http://goo.gl/xtWkd" target="_blank">here</a> if you think hypervisors are secure.</p>
<p>Then you must be thinking to yourself: I am basically screwed and will never be able to deploy in the cloud. You would be wrong. The high automation I prefer (e.g. puppet) reports on changes to files you control, and you control which servers you automate via the PKI system. For very important systems, things like OSSEC are great tools.</p>
<p>One thing people should be very aware of is what Amazon calls "Security Groups". They are akin to a virtual firewall and even sit between servers of the same layer. So by nature, if you have two servers in one security group, they will be unable to SSH to one another or even ping unless you explicitly allow it. That was a fantastic decision by Amazon and really helps security engineers facilitate a more secure environment.</p>
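<p>As an illustration with the modern AWS CLI (not the tooling of the era, and the group name is hypothetical), explicitly allowing SSH between members of the same group looks something like this:</p>
<pre><code># allow SSH only between members of the "web-tier" security group
aws ec2 authorize-security-group-ingress \
    --group-name web-tier \
    --protocol tcp --port 22 \
    --source-group web-tier
</code></pre>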
<p>Amazon in particular, in my opinion, has done an excellent job looking at security from the worst-case scenario. You can go from as locked down as a VPC cluster to as open as a wide-open security group; it is your choice how good or bad at security you are. Also, since the Elastic Load Balancer product belongs to a security group, you get an extra layer of firewalls by only letting those load balancers directly access the web servers (removing some external DDoS attacks). Amazon Web Services has also been verified as a PCI Level 1 Service Provider. I can say from experience that it is a very difficult thing to do and an extremely big commitment for Amazon to make on such a massive scale.</p>
<p>In future posts I will write about how to best architect a cloud system for minimal failures, and how to put your worst fears onto paper in the very important Recovery Time Objective/Recovery Point Objective Disaster Recovery document from a cloud setting.</p>
<p><a href="/articles/security-in-the-cloud/">Security in the 'Cloud'</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on September 13, 2011.</p>
/articles/ipv6-redux2011-04-14T04:00:00+00:002011-04-14T04:00:00+00:00Thomas Vachoncontactme@thomasvachon.com
<p>I have 6to6 capability at my house, yet I noticed the linchpin of fast 6to6 browsing is local DNS resolution. I have an Airport Extreme, but it refuses to hand out the public DNS servers which I put into it, and it runs its own version of the DNS caching daemon built into OS X. This daemon works as intended in OS X, but on the ABES, 6to6 resolution can take over 15 seconds. Doing lookups directly, by forcing external native v4 and v6 DNS servers in my Mac's DNS server configuration (the AirPort interface in this case), eliminates the problem.</p>
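<p>For reference, forcing the resolvers on the Mac side can be done with something like this (the service name and addresses are examples; substitute your own):</p>
<pre><code>networksetup -setdnsservers AirPort 192.0.2.1 2001:db8::1
</code></pre>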
<p>All the Googling I have done shows there is no way to force the ABES to give out designated DNS servers via DHCP while NOT handling DHCP itself (but still doing 4to6 and 4to4 NAT). This is a pity and something Apple should address, or it should make its resolver more IPv6-savvy.</p>
<p><a href="/articles/ipv6-redux/">IPv6 Redux</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on April 14, 2011.</p>
/articles/ipv62011-02-19T05:00:00+00:002011-02-19T05:00:00+00:00Thomas Vachoncontactme@thomasvachon.com
<p>We all have heard quite a bit about IANA running out of IPv4 addresses. While it will be a while until the effects are fully felt, I am doing my part and adding AAAA records to my website. IPv6 is the future whether we are ready or not; there is no time like the present to start thinking about it.</p>
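<p>For anyone curious, the change is as small as a pair of zone file entries like these (a sketch using documentation addresses, not my real ones):</p>
<pre><code>; IPv4 and IPv6 records side by side
www  IN  A     192.0.2.10
www  IN  AAAA  2001:db8::10
</code></pre>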
<p><a href="/articles/ipv6/">IPv6</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on February 19, 2011.</p>
/articles/puppet-continuous-integration2010-08-13T04:00:00+00:002010-08-13T04:00:00+00:00Thomas Vachoncontactme@thomasvachon.com
<p>Puppet is an amazing configuration management system as I have previously written, but one downfall is that no system exists where you check in code, it runs, and if it fails, it alerts. Continuous Integration is a very important thing to have. It saves dev and production environments from being destroyed or otherwise screwed up. After searching all over the web, I was unable to find anyone who has done a full CI system for puppet, so I developed my own.<br />
My CI system consists of 3 parts: The Foreman, XenServer, and Git. I have a cron job which runs every 5 minutes and pulls down the latest code from the "central" git server. I have multiple people merging into this server from multiple companies, so CI was a must-have for us.</p>
<p>If the code on the server is newer than the code on the client, I update the code and rsync it from its staging directory into /etc/puppet on the puppetmaster.</p>
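<p>A minimal sketch of that cron-driven sync (the paths and script name are illustrative, not my actual setup):</p>
<pre><code># crontab entry: poll the central git server every 5 minutes
*/5 * * * * /usr/local/bin/puppet-ci-sync.sh

# puppet-ci-sync.sh, roughly:
#   cd /srv/puppet-staging && git pull origin master
#   rsync -a --delete /srv/puppet-staging/ /etc/puppet/
</code></pre>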
<p>The biggest problem with puppet is that you can't just run syntax checks against the system. Puppet is a stateful system which requires catalogs to be run against the same types of servers that exist in your target environments. My answer? Virtualization.</p>
<p>I have 3 VMs running (one per "role") on my XenServer. Each one simulates how the systems are designed, named, and used in production.</p>
<p>After the run is kicked off and the CI sees new changes and applies them, you need to be able to figure out what worked, what didn't, and get alerted to the breakage. This is where The Foreman comes into play. <a href="http://theforeman.org">The Foreman</a>, for those who don't know, is a Web UI for Puppet reports. It can perform many other functions, like complete unattended kickstart installs, but that is not what I needed it for. The Foreman collects and analyzes runtime metrics as well as states. If for some reason the puppet run fails on the client, it will immediately email the failure to a mailing list I have set up.</p>
<p>The system has been tested, and I have only encountered one problem, for which I have opened a bug with The Foreman team: it will not detect puppetmaster catalog compile errors.</p>
<p>All in all, this system allows multiple sysadmins to commit and work on various modules at the same time while the code is validated in an automated fashion. Still, as good a practice as it is, a CI system should never fully replace a second set of human eyes.</p>
<p><a href="/articles/puppet-continuous-integration/">Puppet Continuous Integration</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on August 13, 2010.</p>
/articles/cron-job-to-ensure-your-puppet-clients-stay-happy2010-07-22T04:00:00+00:002010-07-22T04:00:00+00:00Thomas Vachoncontactme@thomasvachon.com
<p>I wrote a Perl script which is used in combination with cron to make sure that puppet clients don't stray too far from their master. The script can be found <a href="http://files.thomasvachon.com/share/puppet-last-parse">here</a> and is available under the GPL v3.</p>
<p><a href="/articles/cron-job-to-ensure-your-puppet-clients-stay-happy/">Cron Job to Ensure Your Puppet Clients Stay Happy</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on July 22, 2010.</p>
/articles/adding-vmnets-in-vmware-fusion-32010-02-03T05:00:00+00:002010-02-03T05:00:00+00:00Thomas Vachoncontactme@thomasvachon.com
<p>This problem has come up a couple of times and I figured out how to do it. It isn't a pretty thing to do, but it works.<br />
First, open your terminal and go to <code>/Library/Application\ Support/VMware\ Fusion/</code></p>
<p>Run <code>sudo ./vmnet-apps.sh --stop</code></p>
<p>If you want a host-only net, <code>cp -R</code> the vmnet1 folder; if you want a NAT network, <code>cp -R</code> the vmnet8 folder. Name the new folder vmnetX where X is your new network number.</p>
<p>Edit the files inside. There is a dhcpd.conf which must be changed to suit your needs; if it is a NAT network there are also a nat.conf and a nat.mac. Change these to match the network changes you made in dhcpd.conf.</p>
<p>Now edit the networking file. If you want a host-only network, copy the VNET1 entries; if you want NAT, copy the VNET8 entries. Paste and modify the entries to match your vmnet folder's number.</p>
<p>Now delete the VNET_X_DHCP_CFG_HASH line; it will auto-regenerate.</p>
<p>Edit the lines to match the network, etc. If you want the Mac to NOT have a connection (i.e. a self-contained VM network), set VNET_X_VIRTUAL_ADAPTER to no. The sketch below shows roughly what the finished entries look like.</p>
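<p>Roughly what the pasted-and-edited entries end up as (the vmnet number and subnet are examples, in the same format I show in my Fusion 4 post):</p>
<pre><code>answer VNET_4_DHCP yes
answer VNET_4_HOSTONLY_NETMASK 255.255.255.0
answer VNET_4_HOSTONLY_SUBNET 172.16.128.0
answer VNET_4_VIRTUAL_ADAPTER yes
</code></pre>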
<p>Now run <code>sudo ./vmnet-apps.sh --start</code></p>
<p>Do an ifconfig and make sure your new vmnet is up and correctly configured.</p>
<p>Now go into <code>~/Documents/Virtual\ Machines.localized/</code></p>
<p>cd into the VM you want to mess with (note: ADD THE ADAPTERS IN THE UI FIRST)</p>
<p>Modify the .vmx of the guest</p>
<p>From <a href="http://sanbarrow.com/vmx/vmx-network.html">source</a>:</p>
<p>For a VM with VMware tools installed:</p>
<pre><code>ethernet0.present = "true"
ethernet0.startConnected = "true"
ethernet0.virtualDev = "vmxnet"
ethernet0.connectionType = "custom"
ethernet0.vnet = "vmnetX"
</code></pre>
<p>For a VM without VMware tools:</p>
<pre><code>ethernet0.present = "true"
ethernet0.startConnected = "true"
ethernet0.virtualDev = "e1000"
ethernet0.connectionType = "custom"
ethernet0.vnet = "vmnetX"
</code></pre>
<p><a href="/articles/adding-vmnets-in-vmware-fusion-3/">Adding VMNet's in VMWare Fusion 3</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on February 03, 2010.</p>
8.0]]>/articles/how-to-upgrade-a-cisco-pix-515-with-serial-failover-from-6-3-8-02009-10-03T04:00:00+00:002009-10-03T04:00:00+00:00Thomas Vachoncontactme@thomasvachon.com
<p>Well it sounds simple, doesn't it? Cisco says you reload the OS, you make a couple of changes and voila, you have a working Pix 515 running the latest and greatest code (which, by the way, is the same code run by those far more expensive ASAs). Well, not so fast.</p>
<p>First of all, make sure you meet the requirements for running anything over 6.3. This means a 515 or higher Pix (I recommend the 515e as the minimum, not the 515, as the newer code is much heavier and the Pentium II in the 515 is slower). Also, you need to have enough room on your flash; if you don't use the god-forsaken Pix Device Manager (which by all accounts no one ever should) you are fine. Finally, you need RAM. Luckily, as long as you are not covered by SmartNet, feel free to crack open your Pix to reveal its true nature: it runs an Intel motherboard and PC-100 RAM. It supports a maximum of 256 MB (2x128 MB) and RAM is cheap, so go for it and upgrade it to the max. One caveat is that you MUST run an unrestricted license to support 256 MB of RAM. I was able to upgrade a restricted version (as 128 MB is the minimum), but I soon found its flash chip was fried and bought a replacement 515e off the used market.</p>
<p>OK, so you pass the prereqs. Now what to do? Well, you need 2 separate OS images: 7.2 and 8.0 or greater. They are available on Cisco's website for registered users. Also, if you use stateful failover, you need to make sure you have a free Ethernet interface or sub-interface for replication (which I haven't done yet).</p>
<p>Now on to the procedure. Cisco's website is a little fuzzy on how to do this on a pair of failover 515s, so this is where this guide will be of the most use to you. This is certainly a maintenance-window activity, as doing it incorrectly will cause ARP poisoning and other awfulness.</p>
<p>First, BACK UP YOUR CONFIG! (not that this has to be said). Then disconnect the serial cable between the two Pixes. Start the upgrade on the Primary Pix (the one with the Primary side of the serial cable). Upgrade from 6.3 to 7.2 via <code>copy tftp: flash:</code>. The Pix will start complaining about re-writing rules; this is OK right now. Once you are at the prompt, write your config and reboot again. From here you can now go to 8.0 via <code>copy tftp: flash:image.bin</code>.</p>
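<p>In other words, the image copies on the primary look roughly like this (the filenames are examples; use the images you downloaded):</p>
<pre><code>! on the primary Pix, running 6.3
copy tftp: flash:pix722.bin
! write the config and reload, then, from 7.2:
copy tftp: flash:pix804.bin
</code></pre>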
<p>Reboot the Pix again and you will be in 8.0. You may get some warnings about stateful failover (how to solve that is hopefully coming later). Any other warnings should be looked at and confirmed as OK, or fixed; errors must be fixed at this point as well. Now comes the tricky part: for every interface which has a standby IP associated, re-input the ip address line without the standby IP. Also make sure ALL failover lines are gone. Save your config, and now it's time to move to the second Pix.</p>
<p>This time the upgrade starts off a bit differently. Make sure the serial cable is disconnected (as it already should be) and write erase; you want a blank config for this. Reload and do the same 6.3 -> 7.2 (don't bother saving the config this time) and then 7.2 -> 8.0. At this point, write erase again to be sure it's a clean Pix. Power off the Secondary Pix and connect the serial cable on both ends.</p>
<p>Now put your additions back on your ip address lines (yes, you have to type it all out) and wr your config. Now do a show fail; it should report that the partner is powered off, which is correct. Finally, in configure mode, type "failover" on the Primary Pix. Boot up your Secondary Pix, go into configure mode, and type "failover". Magically, "show fail" should pair up and start replicating the config over the serial link to the blank standby unit.</p>
<p>Once everything is up and good, you have upgraded from 6.3 -> 8.0 and now have almost all the features of an ASA. This is a very worthwhile activity, as it gives you a huge bump in features and ease of use. Once I get stateful failover working on a subinterface/trunk, I will post how to finish off the job. However, do heed Cisco's warnings: doing stateful failover using a data-bearing interface is NOT supported. It will not NAT, and it will blow away your ACLs and every reference to that interface; just don't try it.</p>
<p>I hope this helps your upgrade go smoother than ours did (it's only a mild concussion, the doctor says, from hitting our heads against the wall so much).</p>
<p><a href="/articles/how-to-upgrade-a-cisco-pix-515-with-serial-failover-from-6-3-8-0/">How to Upgrade a Cisco Pix 515 With Serial Failover From 6.3 -> 8.0</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on October 03, 2009.</p>
/articles/dance-puppets-dance2009-05-04T04:00:00+00:002009-05-04T04:00:00+00:00Thomas Vachoncontactme@thomasvachon.com
<p>What an odd title, right? What do puppets have to do with system administration? Well, in fact, there is a program called Puppet. What is Puppet? Puppet is a client/server software system put out by Reductive Labs which allows for simple management of *nix systems (OS X included).</p>
<p>"So it manages things, but I can do that with some custom scripts." Well that is a sentiment I have run into in my current position, but it was quickly overcome when shown how puppet differs from other in-house scripts. Most in-house systems are a mix of OSS software combined into a single usable instance. For instance, using rsync to sync up scripts to a list of servers, and ssh to execute and read back any output. While that works on a homogenized environment where all conventions are followed, but what happens if someone spins up a new site where they didn't or more likely could not follow such conventions. This is the problem I found myself in. I inherited a EC2 environment and due to the way EC2 servers are built and the OS version we had to run, none of our pre-made scripts would work. Thus begins my adventures in configuration management.</p>
<p>My first experiment was similar to our in-house system: a combination of rsync and SSH which synced the root file system to a copy of it on the "admin" server. It would then execute any necessary commands via SSH. The "system" did have package management, using a combination of dpkg --get-selections and SSH commands, but it was far from easily manageable and required a run per server type. Overall, while the system worked, it was far from scalable.</p>
<p>Thus began my adventure of looking at configuration management solutions. The three that stood out were cfengine, Puppet, and Bcfg2. While cfengine and Puppet are the most closely related, there are some significant differences in design philosophy which set them apart; most importantly, you have to explicitly define different OSes in cfengine, which Puppet just handles. More info can be found <a href="http://reductivelabs.com/trac/puppet/wiki/CfengineVsPuppet">here</a>. Bcfg2 was not selected for a variety of reasons. It lacks a good way to bundle servers into classes, though some workarounds have been developed. More importantly, its configuration language is not easily understandable by multiple people: it is written in XML and would require extensive comments to make it clear to a multi-person operations team.</p>
<p>This left me with Puppet. Puppet comes highly recommended by several sources, including Digg and Google, both of whose recommendations are not easy to come by. Digg likes to take OSS software and build upon it; they did not need to do this for Puppet, which is a testament to its features and flexibility. Google uses Puppet to manage all their Linux desktops and will be expanding it in the near term to whole data centers. So what is so great about Puppet? Some of the best things are its multiplatform abilities, its code re-usability, and its plain readability and codeability.</p>
<p>OK, so you've heard me rant about Puppet; what's the big deal? I can do this type of stuff in my sleep. Sure you can, but can you install a package on hundreds of servers in a matter of minutes? If you have scripts pre-written, no problem. But what if you want to install a new server, make its package versions match EXACTLY every other server, and you only have 1 hour to do it? You are pretty certainly screwed, unless you have a Puppet system set up. A timed install of a server using apache2, rails, mod_rails, and about 15 other gems takes my Puppet install all of 10 minutes. This is the beauty of Puppet.</p>
<p>So what does the configuration look like? Well, it's very similar to cfengine, as Puppet is an outgrowth of it, but Puppet is also written in Ruby, so you have the power of ERB templating at your fingertips. Let's start with a simple manifest to ensure SSH is installed and running on boot, and that its configuration files are in place.</p>
<pre><code># File: ssh.pp
class sshd {
  file { "/etc/ssh/sshd_config":
    owner  => root,
    group  => root,
    mode   => 0444,
    source => "puppet:///files/sshd_config",
    notify => Service["ssh"],
  }
  file { "/etc/ssh/ssh_config":
    owner  => root,
    group  => root,
    mode   => 0444,
    source => "puppet:///files/ssh_config",
    notify => Service["ssh"],
  }
  service { "ssh":
    ensure => running,
  }
}
</code></pre>
<p>Wow, that's kinda cool, right? But what does it all mean? Well, Puppet is broken down into 3 major building blocks: the node, which is the server entry in a file called nodes.pp; the class, which is a container for a bunch of stuff; and finally the resource, which is the meat and potatoes of the system. The resources, as seen in this example, are the "file" resource and the "service" resource.</p>
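<p>For completeness, a node entry tying a server to the class above would look something like this (the hostname is an example):</p>
<pre><code># nodes.pp
node "web01.example.com" {
  include sshd
}
</code></pre>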
<p>The file resource can take a bunch of options; the most interesting one here is source. The puppet:/// prefix tells the client (which parses these manifests) to look at an embedded WEBrick server (which the puppetmaster runs) and grab the file from there. It then places the file at the path specified in the first line of the resource. The notify line says: if this file is updated, restart SSH.</p>
<p>The service (which has to be in the same class as any notify declarations that reference it) in this case says that SSH has to be running on boot. Puppet is intelligent enough to know that if something has to be running on boot, it also needs to be installed. Note that the service name HAS to match the name your OS uses. Now you say, well, how is THAT portable? Well, watch this: change your service declaration to match:</p>
<pre><code>service { "ssh":
  name => $operatingsystem ? {
    'debian' => "ssh",
    'centos' => "sshd",
  },
  ensure => running,
}
</code></pre>
<p>This queries the underlying OS and uses the appropriate service name. You can do similar things with the "path" attribute if you need the SSH configurations in a different spot.</p>
<p>So that is a quick overview of how puppet works. I will be going more in-depth on how I have chosen to deploy it in a later post.</p>
<p><a href="/articles/dance-puppets-dance/">Dance Puppets, DANCE!</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on May 04, 2009.</p>
/articles/where-has-the-time-gone2009-04-12T04:00:00+00:002009-04-12T04:00:00+00:00Thomas Vachoncontactme@thomasvachon.com
<p>Wow, I certainly have not updated this in forever. I should get back to this, most likely with a better theme soon. I need to update the resume as well. Stay tuned...</p>
<p><a href="/articles/where-has-the-time-gone/">Where Has the Time Gone...</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on April 12, 2009.</p>
/articles/why-linux2008-05-08T04:00:00+00:002008-05-08T04:00:00+00:00Thomas Vachoncontactme@thomasvachon.com
<p>Linux is now rapidly becoming the operating system of choice in many core areas of business. It is transforming information technology in many exciting ways, being used in products ranging from cell phones and PDAs to cars and mainframe computers. In addition to being cost-effective, it is constantly being updated and refined with the latest technologies. As Linux gains greater acceptance in today's information and communication technology, more and more companies are supporting Linux with both application and hardware compatibility.</p>
<p>Like its many uses, Linux has a variety of printed and electronic guides to show you what to do. The specialist guides are highly detailed, focusing on narrow areas of excellence; the encyclopedic guides for beginners focus on Linux fundamentals and only then introduce more specialized topics. Everyone, from beginners to those with the confidence of an expert, can start learning this spectacular and versatile operating system.</p>
<p>Some popular Linux distributions:</p>
<p>PCLinuxOS (Desktop Linux)<br />
Home Page: <a href="http://www.pclinuxos.com/">http://www.pclinuxos.com/</a><br />
Mailing Lists: <a href="http://docs.mypclinuxos.com/Mailing-list">http://docs.mypclinuxos.com/Mailing-list</a><br />
Documentation: <a href="http://docs.pclinuxos.com/">http://docs.pclinuxos.com/</a></p>
<p>Ubuntu (Desktop/Server Linux)<br />
Home Page: <a href="http://www.ubuntu.com/">http://www.ubuntu.com/</a><br />
Mailing Lists: <a href="http://lists.ubuntu.com/mailman/listinfo/">http://lists.ubuntu.com/mailman/listinfo/</a><br />
Documentation: <a href="https://wiki.ubuntu.com/UserDocumentation">https://wiki.ubuntu.com/UserDocumentation</a></p>
<p>openSUSE (Desktop Linux)<br />
Home Page: <a href="http://www.opensuse.org/">http://www.opensuse.org/</a><br />
Mailing Lists: <a href="http://en.opensuse.org/Communicate/Mailinglists">http://en.opensuse.org/Communicate/Mailinglists</a><br />
Documentation: <a href="http://en.opensuse.org/Documentation">http://en.opensuse.org/Documentation</a></p>
<p>Fedora Project (Desktop Linux)<br />
Home Page: <a href="http://fedoraproject.org/">http://fedoraproject.org/</a><br />
Mailing Lists: <a href="http://fedoraproject.org/wiki/Communicate">http://fedoraproject.org/wiki/Communicate</a><br />
Documentation: <a href="http://docs.fedoraproject.org/">http://docs.fedoraproject.org/</a> and <a href="http://fedoraproject.org/wiki/Docs">http://fedoraproject.org/wiki/Docs</a></p>
<p>Debian GNU/Linux (Desktop/Server Linux)<br />
Home Page: <a href="http://www.debian.org/">http://www.debian.org/</a><br />
Mailing Lists: <a href="http://lists.debian.org/">http://lists.debian.org/</a><br />
Documentation: <a href="http://www.debian.org/doc/">http://www.debian.org/doc/</a></p>
<p>Mandriva Linux (Desktop Linux)<br />
Home Page: <a href="http://www.mandrivalinux.com/">http://www.mandrivalinux.com/</a><br />
Mailing Lists: <a href="http://www.mandriva.com/en/mailing_lists">http://www.mandriva.com/en/mailing_lists</a><br />
Documentation: <a href="http://www.mandriva.com/en/community/users/documentation">http://www.mandriva.com/en/community/users/documentation</a></p>
<p>CentOS (Server Linux)<br />
Home Page: <a href="http://www.centos.org/">http://www.centos.org/</a><br />
Mailing Lists: <a href="http://www.centos.org/modules/tinycontent/">http://www.centos.org/modules/tinycontent/</a><br />
Documentation: <a href="http://www.centos.org/docs/">http://www.centos.org/docs/</a></p>
<p>KNOPPIX (Desktop Linux)<br />
Home Page: <a href="http://www.knoppix.com/">http://www.knoppix.com/</a><br />
Mailing Lists: <a href="http://lists.debian.org/debian-knoppix/">http://lists.debian.org/debian-knoppix/</a><br />
Documentation: <a href="http://www.knoppix.net/docs/">http://www.knoppix.net/docs/</a></p>
<p><a href="/articles/why-linux/">Why Linux?</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on May 08, 2008.</p>
/articles/restricting-login-in-linux2008-05-08T04:00:00+00:002008-05-08T04:00:00+00:00Thomas Vachoncontactme@thomasvachon.com
<p>When we talk about forcing a user to log off, what we're really talking about is time restrictions on certain accounts' system access and services. The easiest way I've found to implement time restrictions is by using software called Linux-PAM.</p>
<p>Pluggable Authentication Modules (PAM) is a mechanism for authenticating users. Specifically, we're going to use the pam_time module to control timed access for users to services.</p>
<p>Using the pam_time module, we can set access restrictions to a system and/or specific applications at various times of the day, as well as on specific days. Depending on the configuration, you can use this module to deny access to individual users based on their name, the time of day, the day of the week, the service they're applying for, and the terminal from which they're making the request.</p>
<p>When using pam_time, you must terminate each rule in the /etc/security/time.conf file with a newline.</p>
<p>Always remember that the pound sign [#] is a comment, and the system will ignore any text following it on that line.</p>
<p>This is an example configuration for the pam_time module. The syntax of each line is as follows:</p>
<p><code>services;ttys;users;times</code></p>
<ol type="1">
<li>The first field, services, is a list of PAM service names.</li>
<li>The second field, ttys, is a logic list of terminal names.</li>
<li>The third field, users, is a logic list of users or a netgroup of users.</li>
<li>The fourth field, times, indicates the applicable times.</li>
</ol>
<span>Here's an example of a typical set of rules:</span></p>
<p><span>login ; \* ; !root ; 0800-2000</span></p>
<p>http ; \* ; !root; 0800-2000</p>
<p><span>ftp ; \* ; !root; 0800-2000</span></p>
<p><span><br />
</span></p>
<p>These rules restrict all users except root to logging on between the hours of 0800 and 2000. They likewise restrict http and ftp access to those hours.</p>
<p>Root would be able to log on at any time and browse the Internet at all times as well.</p>
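<p>One last note: pam_time only takes effect for services whose PAM stacks load it. A typical line to add to, for example, /etc/pam.d/login (exact placement varies by distribution, so treat this as a sketch) is:</p>
<pre><code>account  required  pam_time.so
</code></pre>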
<p><a href="/articles/restricting-login-in-linux/">Restricting Login in Linux</a> was originally published by Thomas Vachon at <a href="">Thomas Vachon</a> on May 08, 2008.</p>