[{"content":" Header photo by Roman Synkevych 🇺🇦 on Unsplash\nNOTE: This blog posts follows on from my original post back in 2023 where I first set this up. You can find the original blog post GitHub Actions using AWS and OpenID by using the link.\nBefore we begin Please make sure to read my original blog post, it has been updated with the new steps, but this post goes on to specifically call out the changes that allowed this to happen.\nWhat changed? A few months beyond my original blog post, GitHub and AWS updated the way that they authenticate between each other, by removing the need to pin the certificate thumbprint as part of the OIDC authentication process. GitHub Blog Post. AWS added GitHub to one if its many root certificate authorities (CAs), meaning GitHub can update their authentication certificate without the need for us to update the thumbprint that existed in the setup step of the OIDC setup. AWS Documentation\nFor a while the AWS SDK and the Terraform AWS Provider had not been updated, so still required a thumbprint to be added to the OIDC provider to allow Terraform to create the resource. Thankfully, in December, AWS updated their main Go SDK, which allowed the owners of the Terraform AWS Provider to make their change to make the Thumbprints Optional.\nNOTE For existing implementations, this might be harder to work in, due to a quirk with the AWS API and the way Terraform works. There is a bug issue open for this. Essentially Terraform will not update the API if there is no default setting, clearing out the Thumbprint list with no default means Terraform will not update the API, so the thumbprints are not removed. As the default behaviour for AWS OIDC configurations that are trusted by AWS means that it ignores the thumbprint list, this should not cause any issues with existing OIDC setups. 
This is only an issue with Terraform, and not the AWS API.\nCreation of the OpenID Connect Provider Setting up the Identity Provider (IdP) will need to be updated. The walkthrough documentation has also been updated:\nThe provider URL - in the case of GitHub this is https://token.actions.githubusercontent.com The \u0026ldquo;Audience\u0026rdquo; - which scopes what can use this. Confusingly, in Terraform this is also known as the client_id_list. You no longer need to generate the thumbprint Adding the resource in Terraform Now that we have only the two bits we need, it is easy enough to amend your Terraform code to set up the OpenID Connect Provider in IAM. The code example below has been updated accordingly, removing the need for the old thumbprint_list.\nresource \u0026#34;aws_iam_openid_connect_provider\u0026#34; \u0026#34;github\u0026#34; {\n  url            = \u0026#34;https://token.actions.githubusercontent.com\u0026#34;\n  client_id_list = [\n    \u0026#34;sts.amazonaws.com\u0026#34;\n  ]\n}\nEverything else inside the original blog post still stands.\nWhat about other OpenID Connect Providers? There are only a limited number of providers on AWS\u0026rsquo;s approved list of root certificate authorities (CAs). Therefore you will need to read the documentation on each provider to find out. If you are still using another OIDC provider with AWS, and need to use the thumbprint, then the usual steps still apply. Below is an updated version of the steps needed for a 3rd party OIDC provider.\nYou will need to ensure you have the following:\nThe provider URL - this is normally supplied by your third party, and should be a public address The \u0026ldquo;Audience\u0026rdquo; - which scopes what can use this. Once again, in Terraform this is also known as the client_id_list. The Thumbprint of the endpoint - This one is the trickier one, as you will need to generate this yourself. 
Generating the thumbprint For this example, you will need to ensure that you have downloaded the OIDC provider certificates, or have them to hand to generate the thumbprint. Ideally, if you can get the thumbprint from the provider, you can skip all of these steps.\nIf you know the URL that needs to be used, then you can use the following openssl command like before.\nopenssl s_client -servername tokens.endpoint.oidc.provider.com -showcerts -connect tokens.endpoint.oidc.provider.com:443\nGrab the certificate shown in the output; you will see this starting with -----BEGIN CERTIFICATE-----. Place this content into a file. For this demo, I will use openid.crt.\nUse the OpenSSL command again to generate the fingerprint from the file created above.\nopenssl x509 -in openid.crt -fingerprint -sha1 -noout\nThis should output the fingerprint. Strip away all the extra parts, and the : between each of the pairs of hexadecimal characters, and you should end up with something like this:\n6938fd4d98bab03faadb97b34396831e3780aea1\n⚠️ Note: You will need to make the letters lowercase. Terraform is case sensitive for the variable we need to put this in, but AWS is not, and the mismatch can send Terraform into a bit of a loop. ⚠️\nAdding the resource in Terraform With all of this, you can now start to put together the Terraform elements. We can use the Terraform resource for setting up the OpenID Connect Provider in IAM. 
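As a quick sanity check of that clean-up step, the label, colons, and uppercase letters can all be stripped in one small pipeline (the fingerprint below is a sample value, not a real thumbprint):

```shell
# Sample fingerprint, exactly as openssl prints it (not a real thumbprint)
fp='SHA1 Fingerprint=69:38:FD:4D:98:BA:B0:3F:AA:DB:97:B3:43:96:83:1E:37:80:AE:A1'

# Drop everything up to '=', remove the colons, and lowercase the rest
echo "$fp" | sed 's/.*=//; s/://g' | tr 'A-F' 'a-f'
# prints 6938fd4d98bab03faadb97b34396831e3780aea1
```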
Below is a code example, using the information gathered from the documentation and the thumbprint generation, all placed into a single resource object.\nresource \u0026#34;aws_iam_openid_connect_provider\u0026#34; \u0026#34;other_oidc_provider\u0026#34; {\n  url            = \u0026#34;https://tokens.endpoint.oidc.provider.com\u0026#34;\n  client_id_list = [\n    \u0026#34;sts.amazonaws.com\u0026#34;\n  ]\n  thumbprint_list = [\n    \u0026#34;6938fd4d98bab03faadb97b34396831e3780aea1\u0026#34;\n  ]\n}\nRound up While this is a shorter-than-normal blog post, it should hopefully show the differences between the two types now. Once we get an update on how the bug regarding existing OIDC setups works, I am sure I will post another update!\nAny questions, please let me know!\n","date":"2025-01-12T14:32:19Z","image":"https://static.colinbarker.me.uk/img/blog/2023/02/roman-synkevych-wX2L8L-fGeA-unsplash.jpg","permalink":"https://colinbarker.me.uk/blog/2025-01-12-github-actions-oidc-update/","title":"GitHub Actions and OIDC Update for Terraform and AWS"},{"content":" Header photo by Brittany Colette on Unsplash\nWhat was the issue? While I would love to take all of the credit for this, my old squad working with this customer had to deal with this issue for a very long time, until we got our heads together and figured out what was going on! Jake Morgan put most of this together into a more visual and documented sense for the customer, so I am hoping to use what I remember from back then to put this together.\nThe issue we had: a customer using CNAMEs to point generic host names to key services within their network was having major issues resolving those names once they had migrated to AWS. 
For this I will have to explain the scenario in a little more detail. The domain used in this example is one of my own - and something you should be able to test yourself with your own account if you so wish!\nUltimately during a migration, we needed to move a service from On-Premise to AWS; in doing so, its IP would change, but we only had one top level record for running this service. Switching the IP was a little harder than expected, so here we go into a bit more detail as to what the domain was and what it entailed.\nThe domain For this example, we are going to be using the acmeltd.co.uk domain. One of my personal domains that I use for random testing and development, bought when I had to use a domain for Active Directory, but over the years it has become a little underused! For this to work correctly, the authoritative domain records can be found at:\nns1.faereal.net ns2.faereal.net This is where the root domain sits, and where most of the \u0026ldquo;original\u0026rdquo; configuration will come from. Here we will have a top level entry of service1.acmeltd.co.uk to represent a service hosted somewhere in our environment.\nThe delegation What our customer originally had set up was not quite best practice, but this is why we had come in to migrate them into AWS! However, it can show how the issue can occur.\nHere we have to \u0026ldquo;pretend\u0026rdquo; that we have a DNS server on premise; in this example we will be using ns3.internal.faereal.net - this entry doesn\u0026rsquo;t exist, so it will always fail, but for our customer this was pointing to a DNS server on-premise with a local, non-internet-routable IP.\nThe delegation we will use will be region.prod.acmeltd.co.uk - a regional production zone that will be initially hosted on-premise.\nThe service Here is where we can go back to our service above. 
Internally, the service can be referenced by the DNS record service1.region.prod.acmeltd.co.uk, which we can pretend is an A record that points to 192.168.100.10. This works fine on-premise when looked up. For the next part of our example, the top level service will be a CNAME record, pointing to a record specifically hosted on our internal DNS server. service1.acmeltd.co.uk will be a CNAME record, pointing to service1.region.prod.acmeltd.co.uk. This means anyone looking up service1.acmeltd.co.uk will be pointed to the internal DNS server, where the record will be resolved to 192.168.100.10.\nThe Route53 Private DNS zone used in this example The migration Usually, the easiest option here would be to use a Route53 Outbound Resolver; however, in this instance it doesn\u0026rsquo;t work as expected. So we can say that this is in place, but ignore it for the moment.\nTo try and get around the issue, we set up a Route53 Private Hosted Zone to match the zone that is currently held on premise, region.prod.acmeltd.co.uk, in an attempt to localise the zone inside the VPC, in the hopes that this would resolve the query issues. 
Within this, we included a specific A record that points to a local IP inside AWS, 10.10.10.10, as an example.\nThe expected DNS resolution pathway So with all of this in place, we were expecting the following:\nWithin AWS\nservice1.acmeltd.co.uk -\u0026gt; CNAME -\u0026gt; service1.region.prod.acmeltd.co.uk -\u0026gt; Private Hosted Zone -\u0026gt; 10.10.10.10\nWithin On-Prem\nservice1.acmeltd.co.uk -\u0026gt; CNAME -\u0026gt; service1.region.prod.acmeltd.co.uk -\u0026gt; On-Premise DNS server -\u0026gt; 192.168.100.10\nThe outcome\n[ec2-user@ip-10-10-10-12 ~]$ host service1.acmeltd.co.uk\nHost service1.acmeltd.co.uk not found: 2(SERVFAIL)\nWell, that didn\u0026rsquo;t work at all, did it.\nTroubleshooting the DNS resolvers This is where we got stuck originally: DNS wouldn\u0026rsquo;t resolve, and we needed to get this working to ensure the migration worked. So we stepped through each resolver until we could see where the issue was.\nFrom an EC2 instance For our testing, we are going to be using a simple EC2 instance; this way we can check along the way. So let\u0026rsquo;s look at where it resolves its DNS.\nThe EC2 instance we are testing with inside a VPC, with the associated Route53 Private DNS zone Every VPC has its own DNS resolver built in; specifically, it sits on the second IP of the VPC CIDR range, so if your VPC CIDR were 10.10.10.0/24 the DNS resolver would be at 10.10.10.2. For more information see the AWS documentation. 
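That "plus two" rule is easy to check by hand. Below is a toy shell helper (not an AWS tool) that derives the resolver address from a VPC CIDR; it assumes adding two stays within the last octet, which holds for the examples in this post:

```shell
# Toy helper: print the Route53 Resolver address for a VPC CIDR
# (the network address plus two). Assumes the +2 stays within the
# last octet - true for the CIDRs used in this post.
vpc_resolver() {
  local base="${1%/*}"   # strip the prefix length, e.g. 10.10.10.0
  local o1 o2 o3 o4
  IFS=. read -r o1 o2 o3 o4 <<< "$base"
  echo "$o1.$o2.$o3.$((o4 + 2))"
}

vpc_resolver 10.10.10.0/24   # prints 10.10.10.2
```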
Using the dig command we can see this in action, including the server it was looking at.\n[ec2-user@ip-10-10-10-12 ~]$ dig CNAME service1.acmeltd.co.uk\n\n; \u0026lt;\u0026lt;\u0026gt;\u0026gt; DiG 9.18.28 \u0026lt;\u0026lt;\u0026gt;\u0026gt; service1.acmeltd.co.uk\n;; global options: +cmd\n;; Got answer:\n;; -\u0026gt;\u0026gt;HEADER\u0026lt;\u0026lt;- opcode: QUERY, status: NOERROR, id: 24752\n;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version: 0, flags:; udp: 4096\n;; QUESTION SECTION:\n;service1.acmeltd.co.uk. IN CNAME\n\n;; ANSWER SECTION:\nservice1.acmeltd.co.uk. 60 IN CNAME service1.region.prod.acmeltd.co.uk.\n\n;; Query time: 0 msec\n;; SERVER: 10.10.10.2#53(10.10.10.2) (UDP)\n;; WHEN: Mon Dec 09 13:30:59 UTC 2024\n;; MSG SIZE rcvd: 86\nAs you can see, the dig for the CNAME record did pull back the CNAME record, but it hasn\u0026rsquo;t then continued the resolution on to an A record from anywhere. Very confusing, as the VPC resolver also has the private hosted zone region.prod.acmeltd.co.uk associated to it, so logically it should have picked it up. What follows is why this isn\u0026rsquo;t the case.\nAWS\u0026rsquo;s External DNS resolver From the VPC resolver, it is only logical that the lookup of service1.acmeltd.co.uk would then head out to the two authoritative public DNS servers. To be able to do this, AWS themselves have a resolver to connect out to the public internet for you. 
This is why, on a private subnet with the VPC resolver enabled, it is possible to resolve public DNS records without any access to the internet.\nExample of where the AWS External Resolver and the Authoritative DNS resolver sit in relation to a VPC As the AWS External Resolver isn\u0026rsquo;t authoritative for the acmeltd.co.uk domain, it would pass the query onwards to its DNS servers, in this case the ns1.faereal.net and ns2.faereal.net servers mentioned before.\nThe Authoritative DNS Resolver Now that the query has been received by the authoritative DNS resolver, it can finally get the CNAME record, and this is what we saw from the server - it responded with the CNAME of service1.region.prod.acmeltd.co.uk, which is then sent back to the AWS External Resolver. That resolver then tries to look up the new name, and we hit an issue.\nDiagram showing the flow of what happens when the External Resolver tries to resolve the service1.acmeltd.co.uk hostname The AWS External Resolver already knows that the acmeltd.co.uk zone has an NS record for region.prod.acmeltd.co.uk pointing to the private internal DNS server ns3.internal.faereal.net - this being an internal IP means it can\u0026rsquo;t continue with the resolution, so it reports back a SERVFAIL, and no record is resolved. The AWS External DNS resolver doesn\u0026rsquo;t have access to the VPC private networks, so it wouldn\u0026rsquo;t be able to reach the on-premise DNS servers. It\u0026rsquo;s the \u0026ldquo;knowing\u0026rdquo; part that causes the issue, as it won\u0026rsquo;t push the answer for the CNAME record back down the chain.\nAs one picture The full end to end flow of the DNS lookup As you can see, even with the private zone, and even a specific Route53 outbound resolver in your VPC, this setup doesn\u0026rsquo;t work. So how did we resolve this?\nThe resolution Of everything we tried, the only thing that worked for this customer was using a completely separate domain. 
Let\u0026rsquo;s see how this changes the setup.\nThe new domain For our example, we will be using a new domain, brkr.io, but also creating a new root level record called service2.acmeltd.co.uk that we will CNAME - this is just so that if you wanted to follow along and see this for yourself, the lookups will work for you!\nWith this new domain, we can get the original CNAME record answer to be pushed back into the AWS VPC Resolver instead, and it can then do the final step of the lookup for us. So quickly, this is how we set this up:\nA new root level CNAME record has been set up, service2.acmeltd.co.uk - In the real world, we changed the original service1.acmeltd.co.uk to point to the new domain record The CNAME pointed to service2.region.prod.brkr.io On-premise a new DNS zone was set up for region.prod.brkr.io to resolve services to local IPs (192.168.100.10) within their on-premise setup A new Route53 Private Hosted Zone called region.prod.brkr.io was created with a record for service2.region.prod.brkr.io to point to 10.10.10.10 The output\n[ec2-user@ip-10-10-10-12 ~]$ dig service2.acmeltd.co.uk\n\n; \u0026lt;\u0026lt;\u0026gt;\u0026gt; DiG 9.18.28 \u0026lt;\u0026lt;\u0026gt;\u0026gt; service2.acmeltd.co.uk\n;; global options: +cmd\n;; Got answer:\n;; -\u0026gt;\u0026gt;HEADER\u0026lt;\u0026lt;- opcode: QUERY, status: NOERROR, id: 20464\n;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version: 0, flags:; udp: 4096\n;; QUESTION SECTION:\n;service2.acmeltd.co.uk. IN A\n\n;; ANSWER SECTION:\nservice2.acmeltd.co.uk. 54 IN CNAME service2.region.prod.brkr.io.\nservice2.region.prod.brkr.io. 300 IN A 10.10.10.10\n\n;; Query time: 139 msec\n;; SERVER: 10.10.10.2#53(10.10.10.2) (UDP)\n;; WHEN: Mon Dec 09 14:02:29 UTC 2024\n;; MSG SIZE rcvd: 109\nIt worked!\nWhy did it work? 
For this, we will need to update our original diagram to show the flow, but the main reason is this: the original DNS servers weren\u0026rsquo;t authoritative for the brkr.io domain in our example, so the resolver needed to go back to \u0026ldquo;the start\u0026rdquo; and continue the resolution chain again. This allowed it to use the Route53 Private Hosted Zone for its lookup, but this would also work with a Route53 Outbound Resolver as well.\nThe final working flow, and it resolving to the correct address Summary DNS can be very easy; it can also be a complete nightmare to work out where everything is! For us, it was this mysterious \u0026ldquo;AWS External Resolver\u0026rdquo; which, once we had put it on paper, made complete sense as to why it was the issue - however, not knowing it was there was part of the problem. Always check your DNS resolution chains to see where, and more specifically how, a resolver is getting an answer.\n","date":"2024-12-10T21:32:00Z","image":"https://static.colinbarker.me.uk/img/blog/2024/12/brittany-colette-GFLMi4c8XMg-unsplash.jpg","permalink":"https://colinbarker.me.uk/blog/2024-12-10-dns-on-aws-and-cnames/","title":"AWS DNS, Route53, CNAME records, and how it is resolved"},{"content":" Header photo by Jan Huber on Unsplash\nWhat is AWS Direct Connect? In this world of cloud technologies, and the idea that the cloud can solve all your problems, there is always a need to have some level of connection from an on-premise or data centre location into AWS. Typically, the use of a VPN can be enough for most organisations, and it is a lot cheaper than AWS Direct Connect; however, there will always be a wider security policy, or additional protections you need on your network. 
This is where AWS Direct Connect comes in.\nUnlike a VPN connection, which creates a private, encrypted tunnel between different networks and AWS, AWS Direct Connect can be seen, in the simplest terms, as plugging a network cable from a switch in your racks into a switch inside AWS that is connected directly into an AWS network. A layer 2 \u0026ldquo;wired\u0026rdquo; connection through which you can push the traffic you need. The connection never transits the public internet, and is completely private (to a degree). It can reduce the number of hops over different internet routers to get to your AWS network, reducing the latency and improving the available bandwidth to and from your AWS resources, and depending on the size of the pipe you request from AWS, it could be faster than what is currently possible over a standard Site to Site VPN connection.\nIt uses a networking standard called 802.1Q VLAN tagging to segment the traffic; by tagging traffic that traverses the network, switches can ensure that only the right ports see the right traffic. This is a very similar concept to the VLANs that you may be familiar with from your home network - same technology, just in a different context.\nThis post isn\u0026rsquo;t meant to be a complete re-hash of the documentation! If you would like to know more, then feel free to look at the AWS Direct Connect documentation. Instead, I will go through some of the gotchas that I have encountered with AWS Direct Connect while building a network for a customer.\nAllowed Prefixes for AWS Direct Connect gateways There is a whole page in the AWS documentation called Allowed prefixes interactions specifically for this, so I will talk about the specific gotcha that I encountered. The allowed prefixes act differently depending on the type of association you have linked your AWS network up to Direct Connect with. 
With a Virtual Private Gateway (VGW) the list acts as a filter; with a Transit Gateway (TGW) it becomes an allow list of what can be advertised over Direct Connect - and in the case I worked with, in both directions.\nExample of an AWS Direct Connect connection between a Transit Gateway and a Virtual Private Gateway Firstly, this is just a simplified version of a wider customer setup! It wasn\u0026rsquo;t quite set up this way, but it is summarised to show two different ways to connect your network to AWS. In reality, the customer did have something similar, connected to two different types of setup, which we came in to resolve for them. For this we shall look at the Allowed Prefixes list across both association types.\nWhere to find the Allowed Prefixes list This one caught me out when looking around the interface: if you have never had to use AWS Direct Connect before, you may know the list exists from training, but locating it can take a few minutes! On many occasions during customer calls I got myself lost looking for it, mainly because it\u0026rsquo;s available in two locations!\nLocation of the Allowed Prefixes from the Direct Connection Gateways UI Location of the Allowed Prefixes from the Transit Gateways UI Both locations will send you to the same place, so don\u0026rsquo;t worry that you are editing one and need to amend the other.\nNOTE: Just like with any other BGP connection, making a change to this list will reset the BGP connection and re-advertise the prefixes in the list. This normally isn\u0026rsquo;t an issue, but sometimes it can cause a connectivity break if there is an issue somewhere else in the BGP sessions that exist. Just be aware of this if you need to make a change, and include it in any change request or process you have to amend the list.\nAn example setup Simple example of two different groups of CIDRs, can we work out what will be advertised on site? 
In this example, we have listed essentially what the VPCs are advertising into the different associations. This hasn\u0026rsquo;t covered the lower level details of how this connects in; we shall assume that propagation has been enabled properly on each, and both the Transit Gateway and the Virtual Private Gateway are receiving the correct prefixes.\nAllowed Prefixes list with a Transit Gateway Association Starting with an AWS Transit Gateway association, we had traffic being advertised over BGP to AWS, and the VPCs being pushed back through to on-premise. We would need to be able to advertise networks from the VPC connections and on-premise networks into the Transit Gateway. Remember that essentially Direct Connect is one \u0026ldquo;router\u0026rdquo; and the Transit Gateway is another, linked with a \u0026ldquo;virtual cable\u0026rdquo;. The allowed prefix list in this case acts as both a filter and an announcer right in the middle of the two, but controlled from the Direct Connect side. This list acts as the CIDRs that get advertised into the Transit Gateway. The documentation calls this out in a specific section, but it caught me out with my customer!\nFor this example, we are going to use an allowed prefix list that looks like this:\n10.0.0.0/16 172.26.0.0/20 192.168.0.0/20 100.70.0.0/24 With this prefix list and the above listed AWS advertised prefixes, this will show you what gets advertised on to the on-premise network. 
This works the opposite way around as well - for on-premise networks advertised into AWS - however, for this example we will just go one way.\n| AWS Advertised Prefix | Allowed Prefix Entry | On-Premise Received Prefix | Notes |\n| --- | --- | --- | --- |\n| 10.0.0.0/16 | 10.0.0.0/16 | 10.0.0.0/16 | Simple example, what was advertised is what was received |\n| 172.26.0.0/16 | 172.26.0.0/20 | 172.26.0.0/20 | Here we have a larger CIDR in AWS, but the prefix is smaller in the allowed prefix list, so the smaller prefix is what is received |\n| 192.168.0.0/24 | 192.168.0.0/20 | 192.168.0.0/20 | Here we have a smaller CIDR in AWS, but the prefix is larger in the allowed prefix list, so the larger prefix is what is received |\n| 100.64.0.0/24 | None | None | An example of the filter element working; while we have configured the AWS side to advertise the prefix, it is not allowed to be advertised over Direct Connect to on-premise |\n| None | 100.70.0.0/24 | 100.70.0.0/24 | This is where it can get a little complex: we have added the allowed prefix, but it isn\u0026rsquo;t being advertised on AWS -or- on premise. In this instance both sides of the association will receive the prefix, even though there is no network |\nGOTCHA: Be careful when using the allowed prefix list with a Transit Gateway association not to open up wider CIDR ranges than you need to, as this can have an unintended effect on the traffic that is advertised into the Transit Gateway and on premise.\nAllowed Prefixes list with a Virtual Private Gateway Association Moving on to the Virtual Private Gateway association, this connection uses pure filtering. If you are on the filter list, then you will be allowed to advertise; if you are not, then you can\u0026rsquo;t. The CIDRs advertised must be exact, otherwise the filter will block them. This list will not actively announce any CIDR in the list, so you can\u0026rsquo;t use it to advertise non-existent or wider ranges to make \u0026ldquo;administration\u0026rdquo; easier later on! 
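For completeness, here is roughly how the allowed prefix list is wired up in Terraform - a sketch with placeholder resource names and the example CIDRs from this post, not the customer's real configuration. The list lives on the gateway association itself:

```hcl
# Sketch only - resource names and referenced IDs are placeholders.
# Remember: changing allowed_prefixes resets the BGP session.
resource "aws_dx_gateway_association" "on_prem" {
  dx_gateway_id         = aws_dx_gateway.main.id
  associated_gateway_id = aws_vpn_gateway.on_prem.id # or a Transit Gateway ID

  allowed_prefixes = [
    "10.100.0.0/24",
    "172.16.0.0/20",
    "192.168.0.0/20",
  ]
}
```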
The AWS Documentation also calls this out, but for me this is where I was also caught out.\nFor this example, we are going to use an allowed prefix list that looks like this:\n10.100.0.0/24 172.16.0.0/20 192.168.0.0/20 With this prefix list and the advertised routes that exist on the AWS side, we will show you what gets advertised on the on-premise network. Just like before, this works the opposite way around as well - for on-premise networks advertised into AWS - however, for this example we will just go one way.\n| AWS Advertised Prefix | Allowed Prefix Entry | On-Premise Received Prefix | Notes |\n| --- | --- | --- | --- |\n| 10.100.0.0/24 | 10.100.0.0/24 | 10.100.0.0/24 | Simple example, what was advertised is what was received; the filter allows it |\n| 172.16.0.0/16 | 172.16.0.0/20 | None | We have a wider CIDR in AWS, but the filter is of a smaller range, therefore this does not get advertised into the on-premise network |\n| 192.168.20.0/24 | 192.168.16.0/20 | 192.168.20.0/24 | Here we have a smaller CIDR in AWS, and the filter is for a wider range; as the smaller prefix is within the larger prefix on the filter list, it will be allowed through to the on-premise network |\n| 100.74.0.0/24 | None | None | Simple example, this CIDR is not in the prefix list, so it is not allowed to be advertised |\nGOTCHA: Here we can see that the filter list is not the same as before; different ranges get advertised in different ways. While this is talked about a lot in the AWS documentation, getting experience of using this with AWS Direct Connect is a little harder due to its specific usage and adoption within different customers.\nSummary The final outcome of what would be received by the Customer Gateway Sometimes it can feel counter-intuitive how the two types of allowed prefix lists work, but they are important in knowing the best way to configure a large organisation\u0026rsquo;s connection into AWS. 
Seeing this in person is quite rare due to the limited adoption of AWS Direct Connect, so keep this in mind the next time you run into an odd routing problem when using it.\n","date":"2024-10-21T14:52:00Z","image":"https://static.colinbarker.me.uk/img/blog/2024/10/jan-huber-4MDXq_aqHY4-unsplash.jpg","permalink":"https://colinbarker.me.uk/blog/2024-10-21-aws-direct-connect-allowed-prefix-lists/","title":"AWS Direct Connect Allowed Prefix lists - My \"gotchas\" with it!"},{"content":" Header photo by Sander Weeteling on Unsplash\nWhat is a \u0026ldquo;Zero Trust Network\u0026rdquo; To put it plainly, this is a network that by default has zero trust within it. It is based on the idea of a Zero Trust security model, a specific type of implementation used across different networks.\nThe main concept behind the zero trust security model is \u0026ldquo;never trust, always verify\u0026rdquo;, which means that users and devices should not be trusted by default\n― Zero trust security model, Wikipedia So, how does this apply to networking? Well, the main concept here is that you create a network that by default disallows traffic that isn\u0026rsquo;t known, verified, or wanted. This can be expanded to have all your devices connected into a single network that, by default, only allows traffic to pass between nodes if you accept it. This is a concept often referred to as a \u0026ldquo;Zero Trust Network\u0026rdquo;.\nExample of a simple network with switches and WiFi connection point Starting with a basic scenario, above we have a simple office: it has a Data Room with a load of servers and a couple of Virtual Machines, some Head Office computers, a physical server, and a mobile phone connected to a WiFi network. 
Using this as a base, everything had to be in the same place, but it was all on the same network - devices would talk to each other through the switch, and traffic didn\u0026rsquo;t need to jump over anything other than the switch (excluding the phone!).\nAs the business expanded, it migrated the on-premise data room to a data centre, the Head Office was moved to a new location, and people had the option of working from home. Additionally, the CTO asked: \u0026ldquo;We need to move to AWS\u0026rdquo;. Networking before this would have been quite complex to set up.\nThis is where a Zero Trust Network comes into play.\nDepiction of a Zero Trust Network solution, with AWS hooked in Let\u0026rsquo;s take a look at the above example. Even though the Head Office move was completed, the data centre had been set up, and there were workloads in AWS, we can see here that the network is still directly connected. With this scenario, each segment is part of the wider network. Rather than using a VPN connection to route traffic between them, it is possible to just talk from one side to the other, with the only hop being the Zero Trust network in the middle. Think of the provider as essentially a switch!\nIn this post, I will concentrate primarily on the Zero Trust Appliance in AWS, and how we can connect to a Zero Trust Network.\nIntroduction to Tailscale Tailscale is one of the many Software Defined, Zero Trust networking solutions that exist today. Many providers have different ways of implementing their solutions, but they are all based on the same simple premise: \u0026ldquo;never trust, always verify\u0026rdquo;. 
Tailscale has multiple methods for adding devices to the network; by default, the Access Control Lists (ACLs) will not permit traffic between different devices unless explicitly stated in the configuration of the ACL.\nFor anyone looking behind the scenes at the technology used by Tailscale, I would recommend \u0026ldquo;How Tailscale Works\u0026rdquo; by Avery Pennarun, who wrote about how the data plane works, and gives a couple of examples as to why traditional VPNs might cause latency or even general issues in network connectivity.\nOur example, Tailscale to connect an Office to AWS Simply put, Tailscale will act as our \u0026ldquo;switch\u0026rdquo; and set up the point-to-point network between an EC2 instance and an office where there might be several desktops.\nFor this post today, I will concentrate specifically on the Appliance that will sit in AWS, and how we would configure this for a customer.\nBuilding a Zero Trust Network Router on AWS For this, we will be using a specific set of tooling:\nTerraform v1.7.5 - An infrastructure as code tool AWS Provider - The translator between Terraform and the AWS API Tailscale Provider - The translator between Terraform and the Tailscale API Each \u0026ldquo;device\u0026rdquo; that is connected to Tailscale will need to be authenticated and approved before the device is given access to the network. Even if there is an ACL in place that permits the access, the approval must occur. This can cause an issue when working with Infrastructure as Code (IaC), as the process would need to be automated. This can be overcome by generating a Tailnet Auth Key that is used specifically for the launching of the instance, and more specifically allows your device to be added with \u0026ldquo;pre-approval\u0026rdquo;. 
For this, we will create a "tailnet key" resource.

```hcl
# Create an authorization token for the Tailscale router to add itself to the Tailnet
resource "tailscale_tailnet_key" "tailnet_key" {
  preauthorized = true    # Set to true to allow the pre-approval of the device
  expiry        = 7776000 # Time in seconds
  description   = "Tailnet key for the server"
}
```

Simply put, this generates the key as a resource, and Terraform then knows the key value, along with any other properties returned by the API call that created the key.
With this key, the next step is to install the Tailscale client onto the EC2 instance and make it into a router for any services within AWS. This needs to be done in a few stages when working with IaC, so we should start right at the beginning.
Tailscale offers a very comprehensive guide on how to install the Tailscale client on a vast number of devices, including Linux, macOS, Windows, iOS, Android, and even down to Amazon Fire devices and Chromebooks too! What we are doing is creating a Subnet Router.
The Subnet Router is slightly different on a Zero Trust Network like this. While it is usually recommended to install the client on every single server, client, and virtual machine in the organisation, sometimes it's not needed - even more so for organisations that use the Cloud, have a LOT of ephemeral devices, and where the Software Defined Network of the Cloud Provider already has the security in place that keeps your network Well-Architected.
Within Terraform, there is a function called templatefile that generates a string or block of text from a template file, substituting in variables supplied from within Terraform.
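Terraform's templatefile behaves much like simple string interpolation. As a rough standalone analogy (not Terraform itself), Python's string.Template uses the same ${...} placeholder syntax - the key and CIDR values below are purely hypothetical:

```python
from string import Template

# A cut-down analogue of the user_data template used below (hypothetical values)
template = Template(
    "sudo tailscale up --authkey=${ts_authkey} --advertise-routes=${local_cidrs}"
)

rendered = template.substitute(
    ts_authkey="tskey-example-123",  # hypothetical auth key
    local_cidrs="10.0.0.0/24",       # hypothetical office CIDR
)
print(rendered)
```

Terraform does the equivalent substitution server-side when it renders the user_data for the instance.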
Here, we are using the output of the tailscale_tailnet_key resource above, and pushing the generated key value into the user_data for an AWS EC2 instance, to run a script on the first boot. Below is the template used to install Tailscale.

```bash
#!/bin/bash

## Set the hostname of the server
hostnamectl hostname ${hostname}

## Ensure that the server is up to date with all the current packages
DEBIAN_FRONTEND=noninteractive sudo apt update -y
DEBIAN_FRONTEND=noninteractive sudo apt upgrade -y

## Enable IP Forwarding on the router to ensure that packets will flow as required
echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
echo 'net.ipv6.conf.all.forwarding = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
sudo sysctl -p /etc/sysctl.d/99-tailscale.conf

## Install Tailscale from the source
curl -fsSL https://tailscale.com/install.sh | sh

## Start Tailscale with the authorisation key to add it to the network
sudo tailscale up --authkey=${ts_authkey} --advertise-routes=${local_cidrs} --accept-routes
```

There is quite a bit happening in the script; it can be summarised as follows:
Set the hostname for the instance - While this is an EC2 instance, and usually in the cloud you would use more ephemeral devices, a Subnet Router needs to act more like a "pet" so that Tailscale sees it as an appliance object. Setting the hostname means it is recognisable as to which host this is within your Tailscale network Update the instance using apt - Pretty simple: make sure the instance is running the latest updates and patches before continuing! Set IP Forwarding on the device - This is a key step, especially in Linux, as it enables the EC2 instance to take traffic that it receives and forward it on. This setting tells the networking stack at a kernel level to route traffic through the device, as by default this is switched off.
Install Tailscale - The primary install of Tailscale, taken from the latest version on the Tailscale site. Start up Tailscale WITH the key - Here we finally get to see where our Tailscale key will be used: using the up command, we set the authkey generated before, as well as which routes to advertise. We will go into this in a second, but for now, this is where the key goes. Simple enough script; now we have to move on to creating the EC2 instance that will run it.

```hcl
# Use a data object to get the latest version of Ubuntu
data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-arm64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  owners = ["099720109477"]
}

# Build the Tailscale router using Ubuntu 22.04 LTS
resource "aws_instance" "this" {
  # General setup of the instance using the Ubuntu AMI
  ami                     = data.aws_ami.ubuntu.id
  ebs_optimized           = true
  instance_type           = "t4g.small"
  disable_api_termination = true

  # Key for SSH Access (if required)
  key_name = try(var.ssh_key_name, null)

  # Run the user_data for this instance to install Tailscale
  user_data = templatefile("${path.module}/templates/tailscale-install.sh.tpl",
    {
      ts_authkey  = tailscale_tailnet_key.this.key
      hostname    = var.hostname
      local_cidrs = var.local_cidrs
    }
  )

  # Networking Settings
  source_dest_check      = false # Disabled to allow IP forwarding from the network
  private_ip             = var.private_ip
  ipv6_addresses         = var.ipv6_addresses
  subnet_id              = var.subnet_id
  vpc_security_group_ids = [aws_security_group.this.id]

  .... (snipped)
}
```

A lot is happening in this section. In this example, we are using AWS Graviton (ARM-based) instances, as they are known to be more efficient than other processor types, they can be cheaper, and they are also currently on AWS's Free Tier for the time being!
The data object for the aws_ami - This will look for the latest version of the Ubuntu 22.04 image that exists in the Canonical account. Note the arm64 element of the filter string to look for the ARM version of the AMI. The aws_instance resource to generate the subnet router - The standard for any EC2 instance, the primary resource with all its configuration The general setup of the EC2 resource - Here we are using data.aws_ami.ubuntu.id to ensure the right AMI is set, additionally making sure that disable_api_termination has been configured, as we don't want someone accidentally deleting the instance! An SSH key - Some people might want an SSH key so they can log into the instance; in several cases this might not be needed. User data template - This part is where we take the template bash file above and fill in the variables it knows about with details from Terraform. Here we can see the ts_authkey variable being set from the previously created tailscale_tailnet_key resource. Additional settings are also entered here. Network Settings - This one contains one key variable that MUST be set for any EC2-based router: the source_dest_check variable must be set to false. Source/destination checking is on by default, and within the software-defined VPC network on AWS it ensures that the traffic an instance sees is traffic addressed to it - when the instance acts as a router, it will expect to see traffic pass through it from different devices that it needs to forward on. This check is disabled to allow that to happen. Without it, there is no way for traffic destined for another part of the network to flow through the EC2 instance.
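For completeness, the input variables referenced above (var.hostname, var.local_cidrs, and so on) would need declaring somewhere in the module. A hypothetical variables.tf sketch - names and types assumed from the usage above, not taken from the actual module - might look like:

```hcl
# Hypothetical variable declarations to match the instance resource above
variable "hostname" {
  type        = string
  description = "Hostname to set on the subnet router"
}

variable "local_cidrs" {
  type        = string
  description = "CIDR ranges to advertise into the Tailnet"
}

variable "ssh_key_name" {
  type        = string
  description = "Optional EC2 key pair name for SSH access"
  default     = null
}

variable "private_ip" {
  type        = string
  description = "Private IPv4 address for the instance"
}

variable "subnet_id" {
  type        = string
  description = "Subnet to launch the router into"
}
```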
Once this resource is launched, it will appear in your Tailscale network, be approved, and should also be able to route traffic to and from your AWS network.
There are some other elements that make up the design for this router; for example, you will need Security Groups set up to ensure traffic is allowed inbound and outbound to the EC2 instance and networks. To make this whole process easier, I created a Terraform Module that does all of this for you!
https://github.com/mystcb/terraform-aws-tailscale-router
In Part 2 of this blog post series, I will go into how to use this module to create a Multi-AZ version of this setup, and also include the changes I will make to the module to enable Instance Recovery if the Operating System stops responding.
Examples of other Zero Trust Networking Solutions - CloudFlare Zero Trust - This has more recently had an update that will allow access through one of its WARP clients. This is still in beta so one to keep an eye on. - Enclave - This was my first ever experience with Zero Trust networking, so I can't not name-drop this one! While I personally don't use it anymore, it was here I learnt the basics of Zero Trust networking before moving to Tailscale.
","date":"2024-04-12T18:41:00Z","image":"https://static.colinbarker.me.uk/img/blog/2024/04/sander-weeteling-KABfjuSOx74-unsplash.jpg","permalink":"https://colinbarker.me.uk/blog/2024-04-12-zero-trust-network-router-on-aws/","title":"Zero Trust Network Router on AWS"},{"content":" Header photo by Pascal Meier on Unsplash
Does one size fit all? This is a very good question, and my opinion and answer will always be no. I have always been wary of companies that offer a "full stack Landing Zone" which is ready "as is". Some companies probably do have a good amount of Inner Source material that can be used to build a Landing Zone up quickly, but not a lot of companies do.
A fair few of the big name players have "their way or the highway" versions of the Landing Zone to deploy.
It's like asking everyone on the planet to wear a size 10 shoe (UK size of course!). For me it would be slightly too big, for one of my friends it would be a perfect fit, but for others, I am sure there would be a fair few jokes! However, that is the point of having multiple types of shoes, sizes, and uses - they exist so people can choose what works for them.
Then there is the header photo, chosen because for me airports are a pretty good example of what a Landing Zone is. An airport must have some very specific elements to be able to accept a plane for landing. One key one is a runway of some description. For larger international airports, you would also need a level of security. Then again, when you go to a larger airport you will see shops, duty free, taxi lanes, car parks, you name it - but are they required? No, but they do make things a lot easier!
This type of modular approach is how we should look at Landing Zones, and this is where we move from the Landing Zone being a static design into a concept.
Landing Zone - The Concept Let's dive into the concept here. For me, the Landing Zone is key to any implementation on any cloud provider. It provides the foundation for using cloud services, and ensures that from the start the Landing Zone meets the pillars of the AWS Well-Architected Framework - Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimisation, and Sustainability.
Mandatory Elements These elements of a Landing Zone are pretty much the core, and foundation, of what you intend to deploy on AWS. Without these elements, the Landing Zone will likely not follow the Well-Architected Framework, and probably not work at all!
Don't let the list fool you either - it does seem pretty short, but sometimes it's the smallest implementations that can cause the most hassle!
AWS Account Structure Typical Large Scale AWS Account Structure Example This has to be the first step in the creation of the Landing Zone - the Account Structure. To ensure a safe, secure, and best-practice-driven structure, you will always have to consider multiple accounts with AWS. My personal opinion and highest recommendation is that even if you are a small customer that only needs a single account, still have a minimum of two.
In the world of AWS, the top level account is known as the "Management Account", and while it is just a "normal account" that resources can be spun up in, using it that way is not best practice. Keeping this "Management Account" clear, used just for managing AWS, will always be best practice. Within this account, you should set up your AWS Organisation and your Identity Management. From here, you have the option to delegate responsibility for these items to member accounts within your organisation. With the top of the tree in place, we can continue.
Authentication Example of using an external IdP (EntraID) to log into AWS With your account structure defined, you will need a way to authenticate to AWS and the member accounts. This is where a secure and centralised Identity Provider (IdP) comes into play. There are many ways this can be done, and in the example above, we are using AWS IAM Identity Centre (originally called AWS SSO).
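IAM Identity Centre is largely configured through the console, but a rough Terraform sketch of a permission set gives a feel for the moving parts - the name and session duration here are hypothetical, and the data source assumes Identity Centre has already been enabled in the Management Account:

```hcl
# Look up the existing IAM Identity Centre instance
data "aws_ssoadmin_instances" "this" {}

# A hypothetical permission set that users or groups can assume in member accounts
resource "aws_ssoadmin_permission_set" "admin" {
  name             = "PlatformAdmin" # Hypothetical name
  instance_arn     = tolist(data.aws_ssoadmin_instances.this.arns)[0]
  session_duration = "PT8H" # ISO 8601 duration - 8 hour sessions
}
```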
One of the major benefits of using AWS IAM Identity Centre (IIC) to authenticate is that you can use an external IdP, such as Okta or Entra ID, but it also has its own identity store that can be used to manage identities as well.
Now you should be able to give access to your AWS platform via a centralised identity provider as per the Well-Architected Best Practices.
Security Configuration Example of a Conformance Pack Template used to audit your platform I would normally have this front and centre, at the top of the list of mandatory elements, as security is NEVER an option. However, in this case you do actually need something to apply the security to! Therefore, the last mandatory element of the Landing Zone Concept is Security, and your Security Configuration.
It shouldn't come as a surprise to anyone that no matter what you are doing online, security must be your number one priority. The issues with data breaches are well known, and can affect millions, not just you. Within AWS there are several tools which can help "assess, audit, and evaluate" your estate, such as AWS Config, as seen in the example above. Without this becoming just a very long list of AWS Security Products, which AWS do very well at listing themselves, I want to keep this close to the concept for this blog.
⚠️ Remember: This isn't just about security of access; this is about security of the whole platform - its usage, availability, and the logging of all actions.
There are many tools that can be used to ensure that the security of your platform is as good as it can be, as compliant as it needs to be, and as safe as it should be, and making sure you are alerted and take action quickly all plays into the concept of security.
everything will eventually fail over time
― Werner Vogels, allthingsdistributed.com Took me a while to find that quote!
I have seen it written as "Everything fails, all the time", which is still a fantastic quote - but I couldn't see where Werner had actually said that directly! (Personally I thought it was in a re:Invent video, but all my copies show others pasting the text on top of the screen with him on it!). Either way, I found a 2016 article where he said something very close. Think about it though - it's quite clear that EVERYTHING will eventually fail, and you can say the same for security too. The cat and mouse game of keeping up with bad actors is never going to stop. At some point your security will fail IF you do not maintain it, keep on top of it, and update it.
Once again, Security is mandatory. Not just for a Landing Zone, but pretty much generally!
Optional Elements Only 3 mandatory elements? Doesn't seem right… but for me, that is really it. The Landing Zone is whatever you need it to be to work with your product, application, usage - the list goes on. Some of you may be wondering why something like "Networking" isn't on the list. The key bit you need to worry about if you do deploy a network is (once again) the security - but do you actually need a network at all?
Networking and VPCs Serverless Application, using Lambda and DynamoDB with no network So, here is an example - say you have a Serverless Application: you are using AWS API Gateway with AWS Lambda, then using an Amazon DynamoDB backend. Simple enough - handy if you want to build an application that can return something specific for you, or log something, up to you - but there is no VPC network involved. All of this is happening using AWS's own managed networking and API calls. The API Gateway endpoint is available to you, and your application runs.
You should always remember the rule mentioned above though - Security is front and centre at all times.
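To make the serverless example concrete: a function of this kind needs no network configuration at all, just a handler. The sketch below is a hypothetical, minimal Lambda-style handler (the event shape is assumed, and the DynamoDB access is omitted for brevity):

```python
import json

def handler(event, context):
    # A real function might read/write a DynamoDB table via boto3 here;
    # the point is that no VPC, subnet, or security group is configured.
    name = (event or {}).get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }

# Invoked locally for illustration (API Gateway would invoke this in AWS)
print(handler({"name": "colin"}, None)["statusCode"])  # prints 200
```

Everything between the client and this handler is AWS managed networking.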
Even without any specific networking deployed, you are using a network of some sort, and you must continue to secure it. In this case, the network deployment is optional, but you still have to consider the risk when working with services in this way.
Still, you could add more networking, if you need it!
Go complex with your networking! AWS Control Tower A completed AWS Control Tower configuration Probably my most controversial optional item on the list, and this comes back to the Landing Zone being a concept as well. AWS Control Tower is a tool that can be used to deploy AWS's own opinionated Landing Zone, and with that it comes with a very specific set of security configurations, authentication options, and account design. All the key required elements for your platform - but why state this is optional?
This goes back to the start of this post: a size 10 shoe on someone it doesn't fit wouldn't work. Using the right tool for the right scenario will be your path to success. Using AWS Control Tower is therefore optional - if the service works for you, then use it. For many users of Control Tower, it will work as they expect it to, and cover a lot of the main areas and requirements for your platform. However, so long as your platform contains the key required elements, then you still have a valid Landing Zone.
Your AWS Platform - As The Concept The Landing Zone with services examples Going back to the start, the original AWS Account design that was shown is a valid Landing Zone concept that could be deployed to any customer.
In the example above, we have started to fill out key services that will be used across the platform.
AWS Account Structure - Multiple accounts used for multiple types of services Authentication - Using AWS IAM Identity Centre to act as a single point of entry into the platform Security - Using AWS Security Hub, with a separate Logging account to ensure security across the platform The additional optional elements are also present:
AWS Control Tower - A central tenet of building a safe, secure, and Well-Architected Landing Zone Networking - Seen with a separate Transit account. But note that your Landing Zone will still work without those optional elements!
Conclusion The example shown here might not work for you - it might be too complex, or it might not be as segmented as you require for compliance reasons or an internal security policy. However, you can still use the concept to build your own AWS Platform. Concepts build the framework for your deployment. Frameworks allow you to use your skills, knowledge, and expertise to decide the right tool for the right job, and if not, someone should be able to support you in that decision. However, for the tool of choice to be any good, it must follow the concepts in this blog.
","date":"2024-02-25T10:30:00Z","image":"https://static.colinbarker.me.uk/img/blog/2024/02/pascal-meier-UYiesSO4FiM-unsplash.jpg","permalink":"https://colinbarker.me.uk/blog/2024-02-25-aws-cloud-landing-zone/","title":"Cloud Landing Zones on AWS"},{"content":" Header photo by Kirill Sh on Unsplash
⚠️ I am posting this as I was very close to completing this prior to changing roles. The post here was initially started in February, and as such I had only two parts left to complete. Rather than lose where I was, I thought it best to post this, so that people could see where I am at. I would hope that this will be completed in the near future.
⚠️
Continuing on from Part 1, this guide will hopefully show how a product created by Check Point can be used as a network security appliance, running on EC2 instances within an auto-scaling group behind a Gateway Load Balancer, to create a highly available, scalable network security appliance that acts almost like an on-premise deployment.
NOTE: There are multiple network appliances out there that work with AWS Gateway Load Balancers, and a firewall is just one of the many different uses of this service. As I have recently worked with a customer deploying this using the CloudGuard Network Security for the Gateway Load Balancer appliance, it made sense for me to write up how this was possible using this appliance directly!
How does the Check Point CloudGuard work? With any deployment in the cloud that needs to emulate, or be operated in the same way as, an on-premise deployment, there does need to be a little description as to why this needs to exist in the first place. I have always been very keen to push the message that you need to be Cloud Native while on Cloud Platforms; however, there is always a requirement which can't be met in this way. Luckily for organisations, if you know what AWS can offer, this can easily be used to an organisation's benefit, and enable a smooth pathway to evolving into a Cloud Native organisation. You just need to take that first step!
Let's look at Check Point's primary offering. These are dedicated or virtual appliances that can be installed in offices, data centres, or even larger home deployments. They are powerful systems that offer a wide range of security and connectivity services built into the deployments.
Check Point make the global remote management of these appliances much simpler for organisations by using a Security Management solution that centralises the configuration, provisioning, and monitoring of these appliances.
Basic CheckPoint deployment For larger organisations, security or networking teams having access to this unified console for all network and security management is vital for the continued operation of their company. Moving to a new product very quickly can cause issues with compliance against regulatory requirements that might not be as simple to change overnight.
Additionally, the Shared Responsibility Model might require that the organisation takes on additional responsibility for the network that a cloud provider might not be able to provide, typically for compliance or regulatory reasons.
Wouldn't it be great if you could run the appliance as an EC2 instance on AWS, and manage it in the usual way?
CloudGuard on AWS (the old way!) Well, it would be simple to just say "yes" here, but it never ends up being as simple as that. However, you will be surprised how quickly this can be resolved when you know what the AWS services can offer, and how they can be used to an organisation's advantage.
So in the first instance, it is pretty simple to set up an Amazon EC2 instance using the AWS Marketplace AMI for one of the Check Point Appliances, right?
CheckPoint AWS Cloud based deployment Big issue here: it's a single appliance, sitting in a single Availability Zone. This would not follow any AWS Well-Architected best practices, and could leave an organisation at risk if something were to happen to that instance. For example, if that instance crashed, then connectivity to the services behind the appliance would all become unavailable.
Add to that that the Elastic IP would be attached to the single instance; having written a lot of custom scripts to enable an IP to jump to a new instance, I can say all that work takes time to complete. Time is something larger organisations cannot afford, and the outage could also potentially breach a number of regulatory or compliance requirements.
CloudGuard on AWS (the right way!) As you probably know, Amazon EC2 Auto Scaling is always the best practice for application deployments on EC2. Have your application be ephemeral, spin up and automatically be ready to serve the application to the expected clients, scale with demand, and always be available in the event one of the EC2 instances stops working. So why should that be any different for a network appliance?
The biggest issue that existed with older cloud deployments is that the network was very much set in stone; it was controlled by the cloud provider, and you gave up the responsibility for this network to them. Putting an appliance in the cloud meant some level of sacrifice was needed to ensure that the appliance worked. A basic deployment of the VPC wouldn't give you the option to share cluster-based IPs, or a simple solution for splitting the traffic.
This is where the AWS Gateway Load Balancer comes into play.
CheckPoint in AWS with an Auto-Scaling Group and Gateway Load Balancer From here we are starting to see a deployment that is slightly more recognisable. NAT Gateways sit at the outer edge to allow traffic out, the Check Point Appliances sit in a Private Subnet, and the Gateway Load Balancer is connected to the Auto-Scaling Group, balancing the traffic between the appliances.
Putting all the elements of Check Point together Now that we have a design for the autoscaling CloudGuard appliances, we need to deploy the rest.
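As a sketch of the load-balancing layer just described - the resource names, subnet variables, and health-check port are all hypothetical, not taken from the actual deployment - a Gateway Load Balancer and its GENEVE target group could be declared along these lines in Terraform:

```hcl
# Gateway Load Balancer sitting in front of the appliance auto-scaling group
resource "aws_lb" "gwlb" {
  name               = "cloudguard-gwlb" # Hypothetical name
  load_balancer_type = "gateway"
  subnets            = var.appliance_subnet_ids
}

# GWLB target groups use the GENEVE protocol on port 6081
resource "aws_lb_target_group" "appliances" {
  name     = "cloudguard-appliances"
  protocol = "GENEVE"
  port     = 6081
  vpc_id   = var.vpc_id

  health_check {
    protocol = "TCP"
    port     = 443 # Hypothetical health-check port on the appliance
  }
}
```

The auto-scaling group would then register the appliance instances into this target group.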
To do this we need to jump back a little and ensure we have set up the right elements for the Check Point Security Management service.
Firstly, the Check Point Security Management service. I'll go into this in much more depth in a future blog post, but for the moment we just need to ensure that the service has access to talk to the subnets that will hold the autoscaling CloudGuard instances.
Different connectivity options for the Security Appliance As you can see, the Security Management Service will run on a single instance, but it is also possible to run the service in a High Availability mode. For this example, let's just consider the diagrams of the Security Management Service as one of these highly available modes.
Simple connectivity is needed to the CloudGuard instances, much like you would need connectivity to the physical or virtual appliances on site. A site-to-site VPN or AWS Direct Connect will always be a preferred way to connect to them, especially as the final setup will involve putting the CloudGuard instances into a private subnet.
⚠️ This was where I reached before changing roles; at some point I will revisit this, as I will have to get up-to-date code to continue. Apologies for the delay in getting this article out! ⚠️
","date":"2023-11-20T17:19:10Z","image":"https://static.colinbarker.me.uk/img/blog/2023/11/kirill-sh-eVWWr6nmDf8-unsplash.jpg","permalink":"https://colinbarker.me.uk/blog/2023-11-20-gwlb-part-2/","title":"How can we use an AWS Gateway Load Balancer? (Part 2)"},{"content":" Header photo by Marvin Meyer on Unsplash
What is a game day? A game day sees a group of people solving a fictional problem or developing a new idea. They’re given a short period of time – usually hours, but sometimes days or a couple of weeks – to complete the challenge. The beauty of game days is that they’re inherently flexible. Participants can meet in person or virtually.
They might be existing team members and colleagues, or complete strangers. They could have shared skills and experiences or come from a wide range of backgrounds and disciplines.
Some of the best events I’ve been part of involve technical and non-technical people working together. Having both organised and participated in game days, I think they’re an excellent way to foster a growth mindset. They can also play a valuable role in change management if you’re introducing new ways of working or integrating teams. Game days are great fun, but they also deliver serious business benefits.
Ten great things about game days
Strengthen relationships The time constraints of game days create a high-octane environment. As pressure mounts to complete the task, participants have to rely on each other to get jobs done. This can result in a feeling of camaraderie that lasts far longer than the event itself. It’s a great way to build bridges and nurture relationships between departments, global teams or people with different levels of experience.
Extend skills Game days often require participants to draw on a wide range of skills. The core challenge might be technology focused, but solving it is likely to require leadership, mentoring, communication and decision making. Participants are often pushed out of their comfort zone, and it’s great to see people rise to the occasion. You may discover colleagues have hidden talents that make them a great fit for future projects.
Push boundaries We all know that it’s important to drive continual improvement, but it’s also easy to get stuck in the rut of day-to-day tasks. Game days encourage you to think outside the box. They offer a safe space to push boundaries and deal with any consequences.
Sometimes this reveals new and better ways of working that can be implemented in the real world.
Try new things One of the best – and most stressful – game days I took part in saw the organiser continually introducing faults that we had to go and fix. We’d complete one task, and another bigger problem would emerge. As the challenges escalated, I found myself reaching for tools and techniques that I’d read about but hadn’t yet used. It was a great opportunity to try new ideas and learn on the fly.
Road-test plans Game days can be a powerful test bed for activities like cloud disaster recovery. Having a plan in place is all well and good, but staging a worst-case scenario enables any gaps to be identified and rectified. This can form a central part of regular disaster recovery reviews.
Plan for failure Look at your architecture and think about where things could go wrong or what a malicious actor might do. Structuring a game day script around this can be an interesting way to test the resilience of cloud-based systems.
Encourage innovation The open-ended nature of game days, coupled with their removal from day-to-day priorities, makes them a fertile ground for innovation. They can also create a dynamic space for people with different perspectives and capabilities to spar and stimulate each other. While the timeframe is limited, teams can be surprisingly productive. One game day I hosted resulted in a team developing a chatbot to solve a business challenge. Soon afterwards it was launched on the company’s website.
Solve problems or avoid future issues Building a game day around a specific business goal or priority can be very effective. In a cloud context, you could take a theme like security or cost-efficiency and use that as the focal point. The event might deliver tangible outcomes or ideas that can be implemented in the real world.
At the very least, it will give participants a deeper awareness and understanding of the topic.
Foster psychological safety Cloud adoption can be really hard on individuals as they get to grips with new ways of working. People tend to adapt more quickly and embrace the inevitable challenges more willingly if their team and the wider workplace is psychologically safe. This is a huge cultural issue, and game days can’t be used as a sticking plaster solution. But they can help build trust and create an environment where people are comfortable with experimentation and don’t fear failure.
Accelerate progress Game days are a brilliant vehicle to improve cohesion between and within teams. They encourage people to stretch their capabilities, think differently and try new things. They’re exciting, stimulating and challenging. All of this brings out the best in people, sharpening their skills and refreshing their perspectives. And it greases the wheels to enable quicker, more seamless progress with game-changing initiatives like large-scale cloud adoption.
","date":"2023-10-24T22:18:28Z","image":"https://static.colinbarker.me.uk/img/blog/2023/10/marvin-meyer-SYTO3xs06fU-unsplash.jpg","permalink":"https://colinbarker.me.uk/blog/2023-10-24-ten-reasons-to-make-game-days-a-regular-event/","title":"Ten Reasons to make Game Days a regular event"},{"content":" Header photo by NASA on Unsplash
⚠️ Note: This is Part 3 of my IPv6 on AWS series, the first part is available here, and the second part here. ⚠️
Introduction With this post, we look into how an AWS Network Firewall can be used in a DUALSTACK mode to cover both IPv4 and IPv6. In a future post, we will go into more depth on how we can enable this for IPv6 only, but in this case we are using this as a stepping stone. The main reason for me to go into this detail is that, looking through the current ecosystem, I found the key steps were never all in a single location!
I am sure there will be one soon, but ultimately this was my experience and how we can get over to an IPv6 world.\nWhere to begin? To start, I will be working from one of the AWS Network Firewall example architectures: an AWS Network Firewall with a NAT Gateway. Instead of going over what is essentially already a known design pattern, I will just cover enough to get us started.\nStandard IPv4 AWS Network Firewall Design Diagram of a standard IPv4 network, using AWS Network Firewall and a NAT Gateway The AWS Network Firewall is a centralised managed Firewall Appliance that allows you to scale and protect your workloads in AWS. It uses the AWS Gateway Load Balancer and GENEVE Protocol that we covered in a previous post What is an AWS Gateway Load Balancer anyway?. The main difference is that the EC2 appliance we used in that post is replaced with the AWS managed service.\nWhere does IPv6 fit in? Looking back at part one, I mention in Adding IPv6 outbound routing to the private subnets that with IPv6 networks, having a NAT to expand the IP ranges isn\u0026rsquo;t really needed. You can assign each compute element its own publicly routable address.\nSo with this in mind, the \u0026ldquo;application EC2 instance\u0026rdquo; seen in the design above would get its own IPv6 address, and wouldn\u0026rsquo;t need to be NATted.\nTerraform behind the IPv4 solution Let\u0026rsquo;s start with the original diagram. The code for this is up at 04-network-firewall-ipv4 if you wish to see the whole repo, but here are snippets of the important bits.\nSomething I have had a few issues with was getting an easier mapping from the AWS Network Firewall to the VPC Endpoints in each subnet, so within the locals.tf file, I have generated this block to export the mappings. 
It sets the key for each of the endpoints as the availability zone that it has been configured in, which will be handy later on when we try and map to the right routing table.\n# Generate a list of the network firewall endpoints so the route tables can use them locals { networkfirewall_endpoints = { for i in aws_networkfirewall_firewall.firewall.firewall_status[0].sync_states : i.availability_zone =\u0026gt; i.attachment[0].endpoint_id } } Next we have the routing table for the Internet Gateway. As mentioned in my previous post under How does this work in AWS, the key to routing traffic inbound to the right location is an Edge Associated VPC Route table. In summary, these are attached at the edge, to the Internet Gateway (IGW), to override the standard routing and push traffic directly to the Gateway Load Balancer (GWLB) Endpoint for the Network Firewall.\n# Starting with the route table to be assigned to the Edge, this is the Internet Gateway\u0026#39;s route table resource \u0026#34;aws_route_table\u0026#34; \u0026#34;igw\u0026#34; { vpc_id = aws_vpc.transit.id } resource \u0026#34;aws_route\u0026#34; \u0026#34;igw_to_firewall\u0026#34; { # We use a count here to go over each of the NAT subnets that exist, to create a route for each based on the Availability Zone count = length(var.nat_subnets) route_table_id = aws_route_table.igw.id destination_cidr_block = var.nat_subnets[count.index] vpc_endpoint_id = local.networkfirewall_endpoints[var.availability_zones[count.index]] } # Associate the Route table to the Edge IGW resource \u0026#34;aws_route_table_association\u0026#34; \u0026#34;igw\u0026#34; { gateway_id = aws_internet_gateway.transit.id route_table_id = aws_route_table.igw.id } For the NAT Gateway Subnet, we would have the return route back into the Network Firewall for outbound traffic.\n# Creation of each of the NAT Gateway subnet route 
tables that point to the Network Firewall resource \u0026#34;aws_route_table\u0026#34; \u0026#34;nat\u0026#34; { # We are using a count here, because we need to create a route table for each Availability Zone count = length(var.nat_subnets) vpc_id = aws_vpc.transit.id } # Create a route that tells the NAT network to route traffic to the internet via the NWF resource \u0026#34;aws_route\u0026#34; \u0026#34;nat_to_firewall\u0026#34; { # We are using a count here, because we need to create a route for each Availability Zone count = length(var.nat_subnets) route_table_id = aws_route_table.nat[count.index].id destination_cidr_block = \u0026#34;0.0.0.0/0\u0026#34; vpc_endpoint_id = local.networkfirewall_endpoints[var.availability_zones[count.index]] } # Associate the Route table to the NAT Subnets resource \u0026#34;aws_route_table_association\u0026#34; \u0026#34;nat\u0026#34; { # We are using a count here, because we need to create an association for each Availability Zone count = length(var.nat_subnets) subnet_id = aws_subnet.nat[count.index].id route_table_id = aws_route_table.nat[count.index].id } And then, for the Private Subnets, where the applications, EC2 instances, or any other services that require access to the internet live, we will need to route all traffic through to the NAT Gateway in each of the Availability Zones.\n# Creation of each of the private route tables that point to the NAT Gateway resource \u0026#34;aws_route_table\u0026#34; \u0026#34;private\u0026#34; { # We are using a count here, because we need to create a route for each Availability Zone count = length(var.private_subnets) vpc_id = aws_vpc.transit.id } # Create a route that tells the private network to route traffic to the NAT GW in each AZ resource \u0026#34;aws_route\u0026#34; \u0026#34;private_to_natgw\u0026#34; { count = length(var.private_subnets) route_table_id = aws_route_table.private[count.index].id 
destination_cidr_block = \u0026#34;0.0.0.0/0\u0026#34; nat_gateway_id = aws_nat_gateway.nat[count.index].id } # Associate the Route table to the Private Subnets resource \u0026#34;aws_route_table_association\u0026#34; \u0026#34;private\u0026#34; { count = length(var.private_subnets) subnet_id = aws_subnet.private[count.index].id route_table_id = aws_route_table.private[count.index].id } So far, this is pretty much standard for a GWLB setup. This should cover everything that is needed to get the routing element working for the VPC.\nLet\u0026rsquo;s bring in IPv6 Solution Design Updated Diagram showing the IPv6 routes in a DUALSTACK setup With some minor adjustments to our routes, we are able to add the correct routing in place for IPv6 to pass traffic through the AWS Network Firewall. The difference is that anywhere that can have an IPv6 address, we would route it directly to that Availability Zone\u0026rsquo;s Network Firewall Endpoint (the Gateway Load Balancer endpoint).\nOne major change, though, will be having to convert the existing IPv4 AWS Network Firewall endpoints into what is known as a DUALSTACK address type. This, however, isn\u0026rsquo;t as easy as just updating the Terraform. As per the documentation, it is not possible to change the IPAddressType after you have set the subnet.\nHowever, if you just change the value in Terraform, you will receive the following error message:\nError: associating NetworkFirewall Firewall (arn:aws:network-firewall:eu-west-2:012345678910:firewall/vpc-network-nfw) subnets: InvalidRequestException: subnet mapping(s) is invalid. You can't change the IP address type of an existing subnet. 
Either remove the subnet from the request or change the IP address type to match the subnet's original value, and try again, parameter: [[{\u0026quot;subnetId\u0026quot;:\u0026quot;subnet-01234567891012345\u0026quot;,\u0026quot;ipaddressType\u0026quot;:\u0026quot;DUALSTACK\u0026quot;},{\u0026quot;subnetId\u0026quot;:\u0026quot;subnet-5432109876543210\u0026quot;,\u0026quot;ipaddressType\u0026quot;:\u0026quot;DUALSTACK\u0026quot;},{\u0026quot;subnetId\u0026quot;:\u0026quot;subnet-1111111111111\u0026quot;,\u0026quot;ipaddressType\u0026quot;:\u0026quot;DUALSTACK\u0026quot;}]]\nIn this example, we have three subnets as part of our solution, all three of which were originally set up as \u0026ldquo;IPV4\u0026rdquo;. So we are given two options for migration.\nRemove the AWS Network Firewall and redeploy with the \u0026ldquo;DUALSTACK\u0026rdquo; setting on each subnet Manually change the firewall settings, and re-import back into the Terraform state. In a lot of cases, the first option will be hard because you can\u0026rsquo;t always delete a Firewall that might be in production, and with the second option you have the issue that you can\u0026rsquo;t add in a new endpoint in the same availability zone while the previous one exists. 
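Whichever migration path you take, the end state in Terraform is the same: each subnet_mapping block on the firewall resource carries an explicit ip_address_type. This is a minimal sketch only; the firewall, policy, and subnet resource names here are assumptions rather than taken from the repo:

```hcl
# Sketch of the end-state firewall definition (resource names are assumptions)
resource "aws_networkfirewall_firewall" "firewall" {
  name                = "vpc-network-nfw"
  firewall_policy_arn = aws_networkfirewall_firewall_policy.policy.arn
  vpc_id              = aws_vpc.transit.id

  # One mapping per Availability Zone, now explicitly DUALSTACK
  dynamic "subnet_mapping" {
    for_each = aws_subnet.firewall
    content {
      subnet_id       = subnet_mapping.value.id
      ip_address_type = "DUALSTACK"
    }
  }
}
```

Once the code matches this shape, a plan against the updated state should show no differences.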
However, making the manual changes is never in the spirit of IaC.\nThat being said, the steps to make the manual change with minimal outage, at the cost of some potential additional cross-AZ networking fees, would be as follows:\nComplete once for each availability zone\nReplace the route on the Edge Associated route table that points to the Gateway Load Balancer endpoint for that Availability Zone to point at another subnet\u0026rsquo;s endpoint Replace the route from the Route Tables associated to the NAT Gateway subnets that point to the Gateway Load Balancer for that Availability Zone to point at another subnet\u0026rsquo;s endpoint Edit the AWS Network Firewall and remove the endpoint from the list of Firewall Subnets for that Availability Zone only - save the changes and wait ⚠️ NOTE While the console will show that it has successfully updated, it will error out if it is still deleting the endpoint; this could take up to 20 minutes to complete. ⚠️ Re-edit the AWS Network Firewall and add back in the subnet, specifically selecting \u0026ldquo;DUALSTACK\u0026rdquo; for the IP Address Type, and hit save. Replace the route on the Edge Associated route table that points to the other subnet, back to the new Gateway Load Balancer endpoint for the original subnet. Modify the Terraform to switch the ip_address_type in the subnet_mapping block to be DUALSTACK At this point, run a Terraform Plan to ensure that the state and the Terraform code match up.\n⚠️ NOTE This process can take a few hours to complete! ⚠️\nTerraform behind the IPv6 solution Moving on from the above, at this point we need to add in the routing so that the subnets can use their IPv6 addresses to connect out through the AWS Network Firewall. If you would like to see the code directly, there is a second version of the code up at 05-network-firewall-dualstack.\nIPv4 and IPv6 Routing A rule of VPC Route Tables is that you need to separate out your routes for IPv4 and IPv6. 
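As a minimal illustration of this rule, the IPv4 and IPv6 destinations need their own aws_route resources, even when both target the same Network Firewall endpoint. This sketch reuses the variable and route table names from the snippets in this post, though the route resource names themselves are illustrative:

```hcl
# IPv4 and IPv6 destinations are declared as two separate routes,
# even though both point at the same Network Firewall endpoint.
resource "aws_route" "nat_to_firewall_ipv4" {
  count                  = length(var.nat_subnets)
  route_table_id         = aws_route_table.nat[count.index].id
  destination_cidr_block = "0.0.0.0/0"
  vpc_endpoint_id        = local.networkfirewall_endpoints[var.availability_zones[count.index]]
}

resource "aws_route" "nat_to_firewall_ipv6" {
  count                       = length(var.nat_subnets)
  route_table_id              = aws_route_table.nat[count.index].id
  destination_ipv6_cidr_block = "::/0"
  vpc_endpoint_id             = local.networkfirewall_endpoints[var.availability_zones[count.index]]
}
```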
This can come in handy later on when we discuss what to do in place of the NAT Gateway. However, the example below shows you what a route table should look like for a public-facing subnet behind the AWS Network Firewall.\nRouting on the Edge The route table here will be a little simpler, as all traffic needs to head to the Gateway Load Balancer VPC endpoint for the Availability Zone that it is in. In this case, this can be generated from knowing which of the IPv6 networks are in each Availability Zone.\nA route table showing both an IPv4 and IPv6 route to the Gateway Load Balancer VPC Endpoint and the NAT Gateway on the Edge Associated route Routing in the NAT subnet A route table showing both an IPv4 and IPv6 route to the Gateway Load Balancer VPC Endpoint In this specific case, we can see that the IPv4 \u0026ldquo;Route All\u0026rdquo; 0.0.0.0/0 route goes to the same VPC Endpoint as the IPv6 \u0026ldquo;Route All\u0026rdquo; ::/0 route. This route table is generally used for the Public Subnet behind an AWS Network Firewall, or in our case the NAT Gateway Subnet. Here the target of both is the same endpoint set up for this Availability Zone.\nRouting in the private subnet One major difference comes in specifically on the Private or Application Subnets. These are the ones you would typically have routed through your NAT Gateway, which in an IPv6 world you wouldn\u0026rsquo;t need to do, as every device can have its own IPv6 address.\n⚠️ NOTE In a future post, I will go into methods to ensure that IPv6 addresses are hidden, however, for the use case of the AWS Network Firewall, it isn\u0026rsquo;t currently possible to do this. 
⚠️\nA route table showing both an IPv4 and IPv6 route to the Gateway Load Balancer VPC Endpoint and the NAT Gateway for a private network As you can see in the above image, we are routing IPv4 traffic as normal to the NAT Gateway for the Subnet, but for the IPv6 traffic, we are routing that to the same Gateway Load Balancer VPC Endpoint that we had in this Availability Zone. As services in the IPv6 subnet will have their own publicly routable IPv6 address, this means that we can bypass the NAT Gateway.\nTerraform for routing IPv6 For the Edge Route table, we will be sending the new IPv6 traffic destined for the Private Subnet to the Gateway Load Balancer VPC Endpoint in each of the Availability Zones.\n# Create a route that sends all IPv6 traffic for the Private Subnets to the NWF Endpoint # This is due to IPv6 not being NAT\u0026#39;d, so each application gets its own IPv6 address resource \u0026#34;aws_route\u0026#34; \u0026#34;igw_to_firewall_ipv6\u0026#34; { count = length(var.private_subnets) route_table_id = aws_route_table.igw.id destination_ipv6_cidr_block = aws_subnet.private[count.index].ipv6_cidr_block vpc_endpoint_id = local.networkfirewall_endpoints[var.availability_zones[count.index]] } For the NAT Subnet and the Private Subnet, both of these routes will end up being the same, as technically we have made the Private Network routable through the use of IPv6 addresses. As such, we can include the following two bits of code:\n# Create a route that tells the private network to route IPv6 traffic to the NWF Endpoint in each AZ resource \u0026#34;aws_route\u0026#34; \u0026#34;private_to_natgw_ipv6\u0026#34; { count = length(var.private_subnets) route_table_id = aws_route_table.private[count.index].id destination_ipv6_cidr_block = \u0026#34;::/0\u0026#34; vpc_endpoint_id = local.networkfirewall_endpoints[var.availability_zones[count.index]] } # Create a route that tells the NAT network to route 
IPv6 traffic to the internet via the NWF resource \u0026#34;aws_route\u0026#34; \u0026#34;public_to_firewall_ipv6\u0026#34; { count = length(var.nat_subnets) route_table_id = aws_route_table.nat[count.index].id destination_ipv6_cidr_block = \u0026#34;::/0\u0026#34; vpc_endpoint_id = local.networkfirewall_endpoints[var.availability_zones[count.index]] }\nIn Summary AWS Network Firewall is a great product when used in the right way, and ensuring the right routing is in place is the key to making it work. Hopefully, my own pathway to make this work has helped you out, as this took me a long time to actually get working in the end!\nThere are still some elements that I believe need improving on. For example, I have made a typically private network into a public network, albeit behind the firewall. While not having a NAT Gateway is one of the key benefits of IPv6 networking, it would still be an issue for some customers who really did need a secure private network. While there is the option of the Egress-Only Internet Gateway, currently this won\u0026rsquo;t sit behind the AWS Network Firewall, and ultimately provides a route to the internet that isn\u0026rsquo;t filtered in any way. One suggestion would be to use a NAT instance, or another device to route traffic through, but this will be a post for another day.\nOne final element to this journey: just a few months ago AWS announced IPv6-only networking support for the AWS Network Firewall. 
We shall use this as part of the future post on an IPv6 only VPC.\nThanks again for reading!\n","date":"2023-07-12T19:40:58Z","image":"https://static.colinbarker.me.uk/img/blog/2023/02/nasa-Q1p7bh3SHj8-unsplash.jpg","permalink":"https://colinbarker.me.uk/blog/2023-07-12-enabling-ipv6-on-aws-using-terraform-network-firewall-part-3/","title":"Enabling IPv6 on AWS using Terraform - AWS Network Firewall (Part 3)"},{"content":" Header photo by Alicja Ziajowska on Unsplash\nCloud security fears According to a study by cloud security specialist Barracuda, many IT security professionals are reluctant to host highly sensitive data in the cloud. This is especially true when it comes to customer information (53%) and internal financial data (55%). Respondents also said their cloud security efforts are hampered by a cybersecurity skill shortage (47%) and lack of visibility (42%). More than half (56%) are not confident that their cloud set-up is compliant.\nIt’s likely that these concerns are felt even more acutely in financial services organisations. Yet the opportunities and benefits of cloud adoption are too great to be ignored.\nTo help address this dichotomy, AWS published a Financial Services Industry Lens for its Well-Architected Framework. The document shows how to go beyond general best practice to satisfy stringent financial services industry demands. It focuses on “how to design, deploy, and architect financial services industry workloads that promote the resiliency, security, and operational performance in line with risk and control objectives…including those to meet the regulatory and compliance requirements of supervisory authorities.”\nIt’s critical that financial services workloads hosted on AWS are aligned with the document’s design principles.\nWell-Architected financial services IT security risks come from within the organisation as well as outside it. 
Employee error or negligence can pose a significant threat, not to mention the damage that can be inflicted by malicious insiders. Whether data is stored in the cloud or on-premise, technical solutions must go hand-in-hand with operational measures.\nFor financial services organisations, least-privileged access or a ‘zero trust’ philosophy is a good starting point. AWS also advocates four principles to underpin the design of cloud-based architectures for financial services workloads:\nDocumented operational planning Automated infrastructure and application deployment Security by design Automated governance Automating infrastructure, application deployment and governance is perhaps daunting for organisations that are new to the cloud. However, it enables security to advance to a higher level than can ever be achieved on-premise. By minimising human involvement, it significantly reduces the risk of error and improves consistency. It also allows quicker execution and scaling of security, compliance and governance activities.\nFinancial services organisations undergoing large-scale migration would be well advised to refactor workloads for the new environment. This provides a valuable opportunity to introduce automation alongside security by design approaches. The additional upfront investment will go a long way towards addressing security concerns in the cloud.\nAs well as outlining general principles for the good design of financial services workloads, AWS illustrates six common scenarios that influence design and architecture. These include financial data, regulatory reporting, AI and machine learning, grid computing, open banking and user engagement. 
The list is not meant to be exhaustive, but an additional scenario that we regularly encounter is that related to network connectivity.\nRead on to find out how we handled a financial data scenario for a customer that specialises in employee pay processes.\nFinancial data in cloud-based workloads According to AWS guidance, any financial data architecture should exhibit three common characteristics:\nStrict requirements around user entitlements and data redistribution. Low latency requirements that change depending on how the market data is used (for example, trade decision vs. post trade analytics) and which can vary from seconds to sub-millisecond. Reliable network connectivity for market data providers and exchanges. But how are these upheld during mandatory audits, especially when it comes to user entitlement and data redistribution? This was the challenge facing one of our customers as it prepared for large-scale cloud migration. An internal firewall appliance solution needed to meet the auditing requirements of regulatory bodies without compromising security standards.\nOur solution involved the CloudGuard platform from cybersecurity specialist Check Point, enabling traffic to be scanned securely in a way that met regulatory stipulations. We also suggested modernising the security set-up to make the important transition from a ‘pet’ to a ‘cattle’ mindset. This paved the way for a more automated approach firmly aligned with security by design principles as well as allowing the customer to build out scalable groups behind the scenes. Our approach is rooted in the assumption that failure is inevitable, looking to minimise the damage that occurs when it happens. 
In this way, it supports both the ‘reliability’ and ‘security’ pillars of Well-Architected.\nA high level overview Diagram of an example FSI\u0026#39;s Customer Network As the diagram shows, we used AWS Gateway Load Balancers (GWLB) on the third layer of the networking Open Systems Interconnection (OSI) model, dynamically routing traffic to firewall appliances. Unlike an Application Load Balancer (Layer 7) or the Network Load Balancer (Layer 4), GWLB can step down to the network layer, using the GENEVE protocol, removing the need for ‘static’ or ‘pet’ instances to run the firewall appliance.\nGWLB understands and can control the physical path that data takes without affecting the Transport Layer (4) which primarily deals with transmission protocols. This enables the use of autoscaling to dynamically scale firewall appliances in response to changes in traffic requirements. It’s similar to how an Application Load Balancer understands and translates application traffic, forwarding it to the correct endpoint. And it’s cheaper than running the appliances all the time.\nAs with all AWS’ Elastic Load Balancers, you can configure AWS Private Link to set up endpoint services and endpoints within subnets. We’ll come back to this later.\nThe above design also deploys the AWS Transit Gateway (TGW) service. This ‘level up’ on the original virtual private cloud (VPC) peering option allows you to configure deeper routing of traffic between VPCs. It’s highly available, secure, and ensures traffic never needs to leave the security of your AWS network. Exposure from the public internet is reduced as there are no internet-facing endpoints. And it’s also possible to include complete routing that pushes traffic to a centralised VPC, in this case ‘Egress’. Traffic bound for the internet is routed through the TGW service, and then through the Check Point appliance. 
This in turn uses NAT Gateways to talk to the internet.\nIndustry-specific requirements Using these services together, you can configure Internet Gateways assigned to VPCs to allow modification of routes for external traffic. It’s also possible to associate a standard VPC route table to the edge using an ‘edge association’. By pointing the routes for any external IP to the previously created TGW VPC endpoints, this traffic can be:\nencapsulated and sent through the Check Point appliances, filtered, scanned, checked, and firewalled, returned to the service that exists in the Public Subnet, sent to the final destination. In our diagram, the Application Load Balancer in the Public Subnet balances traffic to EC2 instances in an autoscaling group.\nThe final service is the AWS Resource Access Manager (RAM), which is part of AWS Organizations. Centralised network administration is a key compliance requirement for several financial services certifications or programmes. However, configuring a VPC for access from the internet requires an Internet Gateway, which poses a security risk. Deploying the VPC and Internet Gateway into a central network account provides full control and visibility for relevant people. However, building resources in this VPC would normally require additional work to give appropriate levels of access. AWS RAM solves this by sharing resources with other accounts.\nIn the above example, two of the four subnets required for the set-up to work have been shared. So, the account with the shared subnets will be able to build resources and have full control of the process. However, it will not be able to bypass routing options set in place by the networking team, specifically the Gateway Load Balancer endpoint. 
This reduces risk and assures security/networking teams and auditors that nobody working in the accounts can bypass Check Point appliances.\nEnhanced security outcomes The above solution ensures all traffic, specifically confidential information, is internally routed and doesn’t leave the secure network. Any traffic ingress and egress passes through the Check Point appliances, securing the data within the AWS network.\nNetwork engineers, security teams and compliance auditors are given a centralised view and greater visibility of the deployed AWS network. The set-up permits specific custom data monitoring using the Check Point appliances. With additional custom routing on TGW, you can also force traffic between VPCs to pass through the Check Point appliances. This adds a further level of network segregation, which can reduce the risk of data exfiltration between accounts and environments.\nAn additional benefit is that the Check Point appliance service works like any on-premises version. This reduces the need to re-skill teams to use a different technology stack and facilitates integration with existing management services. It also means the same solution can work with other networking appliances that support the GENEVE protocol, extending the range of services that align with the organisation’s existing skillsets.\nPutting the solution to the test The solution design itself was deployed to a financial services customer that had specific requirements around both ISO27001 compliance, and their own internal security teams. As with all customers you work with, there will always be requirements which do not fit a \u0026ldquo;standard\u0026rdquo; pattern of deployment, or services required. While there are many AWS services that could be used that are managed by AWS through the shared responsibility model, there are times when these might not meet a specific set of requirements. 
For our customer at Expert Thinking, we had to ensure that pre-existing administration of the Check Point appliances hosted in-house was not separated from the management of network security within AWS. Using the AWS Gateway Load Balancer to our advantage, we were able to use the Check Point CloudGuard appliances instead.\nBy using this solution, we were still able to meet the checks within the AWS Well-Architected Framework for section 5 of the Security Pillar. Specifically the sections on Controlling traffic at all layers and Automating network protection. This was achieved through the use of the centralised Check Point Management system across the customer\u0026rsquo;s networking estate, feeding key information in to the Check Point Appliances within AWS, so that the pre-existing automations for network security were preserved, including across all of the layers.\nOne of the great ideals behind the Well-Architected Framework is that it is not a checklist of specific AWS services to use, but a checklist of concepts that, if you follow them, will ensure you are Well-Architected. Automating network protection could be using AWS WAF, or AWS CloudFront, as well as also using a managed firewall, or a firewall that you have installed on an EC2 instance that you run automation scripts on. Look at the risk behind each option, and if you have removed the risk, or accepted the risk with mitigations in place, then you will continue to be Well-Architected.\nExpert Thinking\u0026rsquo;s Well-Architected Assessment We at Expert Thinking used our own AWS Well-Architected Assessment throughout the process of delivering for this customer. The review is not just a single check that you leave to the side, but a continual check throughout the whole delivery process. Expert Thinking ran the AWS Well-Architected Review at the discovery phase of our journey, understanding the current security requirements and potential high risk elements that we would need to remediate. 
Working with the customer during the design and implementation phases, we continually checked in with the Well-Architected Framework to ensure that the work we had been delivering still met the standards set within each of the pillars.\nThe Six Pillars of the AWS Well-Architected Framework Once we had completed the delivery for the customer, a final review was performed by Expert Thinking\u0026rsquo;s consultants to once again show that we had delivered the security requirements set by both the customer and the Well-Architected Framework. In doing so, we continued to boost best practices for our financial services customer.\nIf you wish to discuss Expert Thinking running an AWS Well-Architected Assessment on your estate, then please reach out to our wonderful team who will be able to set you along the path of continued best practices.\n","date":"2023-05-02T18:11:45Z","image":"https://static.colinbarker.me.uk/img/blog/2023/05/alicja-ziajowska-AOjmfr3ofSY-unsplash.jpg","permalink":"https://colinbarker.me.uk/blog/2023-05-02-boosting-cloud-best-practice-for-financial-services/","title":"Boosting cloud best practice for financial services organisations"},{"content":" Header photo by Christopher Bill on Unsplash\nWhat can be done? Either scenario can result in decision makers and business managers getting the wrong impression of cloud computing, and potentially missing out on major business benefits. To avoid falling into this trap, a more considered approach to TCO calculation is needed. A detailed look at likely repercussions of migration can reveal both positive and negative outcomes, which enables more informed decisions.\nAs the AWS Practice Lead at Expert Thinking, I support and guide clients at every step of their cloud migration journey. Over the years, I’ve found that five key factors are often overlooked during an organisation’s TCO calculation phase. 
When they’re acknowledged, the result is a far more sophisticated estimation that goes beyond immediate transactional costs to build a longer-term picture.\nThe five factors that hinder TCO calculations Like-for-like comparisons are rarely accurate At a basic level, cloud migration TCO estimates look at the cost of hosting an instance in the original environment versus hosting the same instance in the cloud. AWS provides a great tool for this initial comparison with its Migration Evaluator.\nHowever, things are never quite as simple as a like-for-like comparison might suggest. It should be treated as the first step, giving an initial indication of TCO ahead of further analysis and consideration of the wider context.\nFor instance, when migrating from a datacentre to the cloud, the depreciation of equipment value should be factored in. If the hardware CAPEX was expected to span five years, and cloud migration happens after two, that cost will remain on the balance sheet for another three years. What’s more, the cloud costs will include an allocation for management overhead and network bandwidth in addition to hosting.\nOn the face of it, these factors could make costs following cloud migration seem disproportionately high. But the story doesn’t end there.\nShared responsibility is often misunderstood A central aspect of AWS’ cloud offering is the shared responsibility model. It means AWS handles the operation, management and control of components from the operating system and virtualisation layer as well as the physical security of facilities.\nSo, while at first glance it may look as though monthly costs are higher with AWS, this doesn’t account for the reduced operational burden that shared responsibility brings. Relieving people from the ongoing management of underlying hardware unlocks new potential for staff productivity and business agility. 
Both of these elements play a critical role in deriving value from cloud computing, as per the AWS Cloud Value Framework.\nFor instance, time previously spent on day-to-day security and compliance matters can be diverted to application development or operational enhancements. This is a golden opportunity to reskill or upskill staff, boosting team morale. It ultimately leads to cumulative benefits as people make improvements and pass new learnings on to colleagues.\nThe impact of past incidents and outages is forgotten Like-for-like comparisons also ignore the fact that historic performance and reliability issues can be mitigated in the new environment through improved operational resilience.\nBased on conversations we’ve had with hundreds of medium and large firms, it’s not uncommon for costs associated with a single outage to exceed £100k. When teams are too busy to properly identify or resolve underlying issues, they simply patch the platform up and the problems continue. It’s important to factor costs associated with platform instability into the TCO equation since many issues can be alleviated in the cloud environment.\nThe benefits of cloud-based business continuity can be harnessed from the outset of the migration process. For instance, AWS’ CloudEndure Disaster Recovery is an excellent tool that can be implemented across physical, virtual and cloud servers alike. We regularly use it on a short-term basis to protect customers’ existing environments during transformative migrations. It means that while time and energy is focused on building the new environment, the risk of suffering an outage on the old one is greatly reduced. CloudEndure can also be used as a cloud migration tool in situations where an automated lift-and-shift is appropriate.\nSoftware licences get overlooked As with hardware, software licences need to be included in TCO estimates as not all software solutions are cloud friendly. 
Older licences may be tied to physical CPUs or licensed per core, and others are simply not portable to the cloud. AWS offers ‘Bring Your Own Licence’ (BYOL) options for Windows Server licences via EC2 instances. However, there may be additional licences that cannot be used in the new environment, potentially resulting in cost duplication.\nTaking steps to understand the impact of cloud migration on software licence costs at an early stage is critical. Conducting an AWS Migration Readiness Assessment is an effective way to achieve this. The assessment identifies systems that are reliant on software and triggers questions related to licence Ts\u0026amp;Cs. From here, decision makers can make an informed judgement surrounding the total costs of hardware, software and licences in the original environment compared to an AWS PAYG model.\nExisting contractual obligations go unnoticed Multiyear contracts for services such as rack rental are another factor that’s often forgotten in like-for-like comparisons. Decommissioning racks before the end of a contract can result in penalty charges that should be included in the TCO calculation. It’s useful to prepare a burndown chart of costs for existing contracted services to provide clarity on the matter.\nFor organisations that prefer to work with a long-term contract fee structure, AWS can offer multi-year deals that may offset some of the costs associated with early termination penalties. Options include Amazon EC2 Reserved Instances and Savings Plans, both of which are up to 72% cheaper than on-demand instance pricing.\nTake a broad perspective In summary, TCO calculations need to look beyond the initial transaction. They should consider the bigger picture of costs involved with migrating to the cloud versus remaining in the current environment. Even then, the true value of cloud computing extends beyond potential cost savings. It unlocks new ways of working that boost staff productivity, operational resilience and business agility.
The combined benefits of these improvements can have a transformative impact on business culture and performance. Cloud migration is an opportunity to reposition any business as dynamic, customer-focused and value-driven, rather than a slow-moving, cost-led entity.\n","date":"2023-04-19T12:50:19Z","image":"https://static.colinbarker.me.uk/img/blog/2023/04/christopher-bill-rrTRZdCu7No-unsplash.jpg","permalink":"https://colinbarker.me.uk/blog/2023-04-19-common-mistakes-that-skew-cloud-migration-tco-estimates/","title":"Five common mistakes that skew cloud migration TCO estimates"},{"content":" Header photo by NASA on Unsplash\n⚠️ Note: This is Part 2 of my IPv6 on AWS series, the first part is available here. ⚠️\nWhat is a \u0026ldquo;pet\u0026rdquo; instance, and why? Before I start, this is an exceptional use case. When using AWS you should always use ephemeral instances, it is always best practice. There are times however, you might need to do this to a singular instance for any number of very valid reasons. In my case, I have a very cheap proxy service between the outside world any my home network. I use this external service as a very cheap method to enable access to some systems that can be accessed through a Site-to-Site VPN to my home. 
You might have other reasons too - for example, Microsoft Active Directory servers hosted on EC2 instances would be considered \u0026ldquo;pets\u0026rdquo;.\nIn my case, I have this one static instance using an Elastic IPv4 address, and I would like to give it an IPv6 address.\nPets vs Cattle Analogy cattle, not pets\n― Bill Baker, Twitter This phrase was used during a presentation about Scaling SQL Servers way back in 2006, to show how attitudes towards computing have evolved since the early days.\nIt describes the idea that servers can be one of two types:\nPets: These are your pride and joy; there is just one Oz the Cat who sits on your lap while you write blog posts about IPv6. You look after them, nurture them, and deal with everything that comes their way as and when it happens. Cattle: You have a farm with a vast number of animals that help you produce several products that you sell at market. If one of those animals becomes an issue, you \u0026ldquo;replace\u0026rdquo; them. No sentimental attachment. I will say, this analogy will probably not stand the test of time - the latter \u0026ldquo;Cattle\u0026rdquo; explanation can upset a few people for different reasons, and you will probably see it change in the future. I am hoping for something like \u0026ldquo;Cooker vs Food\u0026rdquo; or \u0026ldquo;House vs Tent\u0026rdquo;, but I can\u0026rsquo;t see that happening any time soon!\nSo with that in mind, you can see the following:\nPet Instances: They have a set hostname, and they have had loads of love, care, and attention given to them. When they go wrong, you investigate, identify the issues, remediate, and bring them back to health. They also make lots of noise when they need your attention. (Yes, Oz is meowing at me at the moment!) Cattle Instances: They have a randomly generated unique identifier; you keep them safe and secure, but once an instance starts to fail, you quickly take it out of the loop and replace it with a healthy instance.
The failed instance, you just get rid of. Starting Point Before we begin, we need to set the scene a little. What we will be working with is incredibly simple - an EC2 instance sitting in a Public Subnet.\nVery basic setup of an EC2 instance in a public subnet You can find the code to deploy the above diagram on my GitHub - IPv6 on AWS repo. Feel free to follow along in your own sandbox account if you wish!\nThe main block of code we need to start looking at is the ec2_instance resource; there are some comments inline to explain what we are doing.\n# Build the EC2 instance using Amazon Linux 2 resource \u0026#34;aws_instance\u0026#34; \u0026#34;test\u0026#34; { ami = data.aws_ami.amazon_linux_2.id # Dynamically chosen Amazon Linux AMI ebs_optimized = true # EBS Optimised instance instance_type = \u0026#34;t4g.nano\u0026#34; # Using a Graviton based instance here disable_api_termination = true # Always good practice to stop pet instances being terminated # Networking settings, setting the private IP to the 10th IP in the subnet, and attaching to the right SG and Subnets source_dest_check = false private_ip = cidrhost(aws_subnet.public_a.cidr_block, 10) subnet_id = aws_subnet.public_a.id vpc_security_group_ids = [aws_security_group.test_sg.id] # This requires that the metadata endpoint on the instance uses the new IMDSv2 secure endpoint metadata_options { http_endpoint = \u0026#34;enabled\u0026#34; http_tokens = \u0026#34;required\u0026#34; } # Sets the size of the EBS root volume attached to the instance root_block_device { volume_size = \u0026#34;8\u0026#34; # In GB volume_type = \u0026#34;gp3\u0026#34; # Volume Type encrypted = true # Always best practice to encrypt delete_on_termination = true # Make sure that the volume is deleted on termination } # Name of the instance, for the console tags = { \u0026#34;Name\u0026#34; = \u0026#34;Sample-EC2-Instance\u0026#34;
} # Ensures the Internet Gateway has been set up before deploying the instance depends_on = [ aws_internet_gateway.sample_igw ] } Nice and simple really, but this is where simple can cause an issue.\nThe Issue Without going into too much detail: as you know, Terraform uses a State that stores details about the environment it manages, including the settings and configuration of resources. It uses this information to keep track of what it knows and what has changed - giving it the great ability to show what will change during the next application of your code. This state is generated from your code and the outputs of the provider APIs.\nHowever, the API call for creating an EC2 instance does more than just create a single EC2 resource. This also happens with multiple services and multiple cloud providers, so it isn\u0026rsquo;t specific to AWS. The single API call makes the process of building up the compute a lot simpler for you, but it includes one additional resource: the Elastic Network Interface.\nIn typical use this is great - it saves you building instances with no networking and then having to mess about with it later on. I would also point out that a cloud service with no network access is no different to getting a physical computer, disconnecting everything except the power, wrapping it in concrete, and then seeing what you can do. It will still run (albeit very hot, and it will probably melt), but you can\u0026rsquo;t do anything with it, or see what is happening!\nEC2 instance showing the Elastic Network Interface (ENI) attached Terraform does have a resource for the Elastic Network Interface (ENI). However, as the API created this resource itself and attached it to the instance using the parameters specified in the ec2_instance resource, Terraform doesn\u0026rsquo;t know about it.
Herein lies the issue.\nAdding an IPv6 Address using the CLI Let\u0026rsquo;s say you don\u0026rsquo;t have your infrastructure defined as code and you need to assign an IPv6 address to your instance - thankfully the awscli has such a method. Under the ec2 service type, there is a command to assign-ipv6-addresses (Documentation)\nTo do this, you would run the following command:\naws ec2 assign-ipv6-addresses --ipv6-addresses \u0026lt;address\u0026gt; --network-interface-id eni-007ed4d597d6df6b7\nThe one item you need to know is the network-interface-id. While the ec2_instance resource does have this as an output, the only way to get it is after the resource has been created.\nWhat happens in Terraform? When using the AWS EC2 API to create the instance, behind the scenes it will create that interface using the settings given, and return the Interface ID. As the APIs are not supposed to keep state themselves, any change to those settings in the creation block is always treated as a new interface/network configuration.\nAs we define the network settings in our resource for the EC2 instance, Terraform knows there is a change to the state and talks to the AWS EC2 API, which then requires the network interface to be recreated. The only way to do that would be to create a new interface, detach the old interface, attach the new interface, and you are done - except that isn\u0026rsquo;t possible. The AWS Documentation on Network Interfaces states that \u0026ldquo;Each instance has a default network interface, called the primary network interface. You cannot detach a primary network interface from an instance.\u0026rdquo; This comes into play with my workaround later on in this blog.
Therefore, the only option is to re-create the instance.\nIf we were to simply use the ipv6_addresses argument in the ec2_instance block to add a new IPv6 address, Terraform would report the following:\nTerraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols: -/+ destroy and then create replacement Terraform will perform the following actions: # aws_instance.test must be replaced -/+ resource \u0026#34;aws_instance\u0026#34; \u0026#34;test\u0026#34; { \u0026lt;\u0026lt;SNIP\u0026gt;\u0026gt; ~ instance_initiated_shutdown_behavior = \u0026#34;stop\u0026#34; -\u0026gt; (known after apply) ~ instance_state = \u0026#34;running\u0026#34; -\u0026gt; (known after apply) ~ ipv6_address_count = 0 -\u0026gt; (known after apply) ~ ipv6_addresses = [ # forces replacement + \u0026#34;2a05:d01c:b90:ee00::10\u0026#34;, ] + key_name = (known after apply) ~ monitoring = false -\u0026gt; (known after apply) + outpost_arn = (known after apply) \u0026lt;\u0026lt;SNIP\u0026gt;\u0026gt; } As you can see, adding the address forces replacement - which, for our wonderful pet instance here, is a terrifying thought!\nWorkaround - Pseudo Code ⚠️ Note: This is not going to be the best way; in all honesty this is a hack more than a workaround, but it does make it easier to bypass the issue. Always consider backups before doing this, and always consider: if you need to do this, why do you need to do it? ⚠️\nGiven that the AWS CLI (as well as the console) can add an IPv6 address without rebuilding the instance, a replacement isn\u0026rsquo;t actually needed to add the address. For this, we have to play around a bit with Terraform, running it a few times, to get everything in sync.
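Expressed as CLI commands, the whole dance looks something like this. This is a sketch only - the ENI ID is the example one used later in this post, and the resource addresses assume the sample code from my repo - so treat it as an outline rather than something to run blindly:

```shell
# Import the primary ENI into a temporary aws_network_interface resource
terraform import aws_network_interface.test_eni eni-071f55452ccc18997

# Apply so the IPv6 address is added in place - always review the plan first!
terraform plan
terraform apply

# Once the ipv6_addresses list has been mirrored into the aws_instance block,
# hand the ENI back to the instance by dropping it from the State
terraform state rm aws_network_interface.test_eni

# Confirm there is nothing left to change
terraform plan
```

The individual steps are walked through in detail below.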
To do this we will need to:\nCreate an aws_network_interface resource to match the primary ENI with the IPv6 address Import the ENI resource into the Terraform State (requires CLI) Apply the changes using Terraform, and confirm the IPv6 address has been attached Add the ipv6_addresses list to the ec2_instance resource block Delete the aws_network_interface from the Terraform State (requires CLI) Delete the aws_network_interface resource block from the code Run a plan to confirm everything is working As you can see, there are a few steps to this, but it does mean that if you ever need to re-deploy the code, it will create an identical copy of the instance (minus data) and match the code.\nThis also removes an issue later on if you ever wish to destroy the environment. If you were to stop after the apply, when you have the IPv6 address assigned, you would have an aws_network_interface resource block managed by Terraform, but also managed by the EC2 instance itself. As mentioned before, you can\u0026rsquo;t remove the primary network interface from the EC2 instance, so if you try to run a destroy, Terraform will call the API to tell AWS to delete the interface, and come back with a 400 error.\nError: detaching EC2 Network Interface (eni-071f55452ccc18997/eni-attach-0a424f74a43a7a0af): OperationNotPermitted: The network interface at device index 0 cannot be detached.\nSo the workaround requires us to complete all the steps to make this work.\nWorkaround - Terraform Create the ENI First we need to create the aws_network_interface resource block. We will need to match as much as we can of what we have defined in the ec2_instance block.
Below is an example of this block created using the sample code above.\nresource \u0026#34;aws_network_interface\u0026#34; \u0026#34;test_eni\u0026#34; { subnet_id = aws_subnet.public_a.id # Uses the same IP from the ec2_instance resource private_ips = [cidrhost(aws_subnet.public_a.cidr_block, 10)] # The new IPv6 to assign - note that 16 in decimal is 10 in hexadecimal ipv6_addresses = [cidrhost(aws_subnet.public_a.ipv6_cidr_block, 16)] # Same security group from the ec2_instance resource security_groups = [aws_security_group.test_sg.id] # Continue with additional settings from the ec2_instance resource source_dest_check = false # Match the attachment details on the EC2 instance attachment { instance = aws_instance.test.id device_index = 0 } } Import the ENI into State Next we need to get the ENI information into the state; to do this we need the network interface ID from AWS. This can be done through the console, or by looking at the State file (if you can). In our example, the ENI is eni-071f55452ccc18997. Terraform lists the import command you need at the bottom of each resource\u0026rsquo;s documentation page; in our case we will need to import this ENI into our resource block:\nterraform import aws_network_interface.test_eni eni-071f55452ccc18997\nApply the changes As we have already created the block with the IPv6 address in it, we should be able to run the plan and apply to add the IPv6 address. Do remember to check your plan! Make sure that it is doing what you expect; in my case the plan showed:\nTerraform used the selected providers to generate the following execution plan.
Resource actions are indicated with the following symbols: ~ update in-place Terraform will perform the following actions: # aws_network_interface.test_eni will be updated in-place ~ resource \u0026#34;aws_network_interface\u0026#34; \u0026#34;test_eni\u0026#34; { id = \u0026#34;eni-071f55452ccc18997\u0026#34; ~ ipv6_address_count = 0 -\u0026gt; (known after apply) ~ ipv6_address_list = [] -\u0026gt; (known after apply) + ipv6_address_list_enabled = false ~ ipv6_addresses = [ + \u0026#34;2a05:d01c:b90:ee00::10\u0026#34;, ] + private_ip_list_enabled = false tags = {} ~ tags_all = { + \u0026#34;Environment\u0026#34; = \u0026#34;Sandbox\u0026#34; + \u0026#34;Source\u0026#34; = \u0026#34;Terraform\u0026#34; } # (15 unchanged attributes hidden) # (1 unchanged block hidden) } As shown, the only major changes are that an additional IPv6 address is assigned to it, and my default tags are added too.\nThe EC2 instance now has an IPv6 address associated with it Technically at this point, we have done it - but we still have to sort out our Terraform to prevent future issues.\nAdd the IPv6 address to the instance block Thankfully the block we created also has the same line that we can copy back into the aws_instance block; we can sneak this back in to match the state in AWS.\n# Build the EC2 instance using Amazon Linux 2 resource \u0026#34;aws_instance\u0026#34; \u0026#34;test\u0026#34; { \u0026lt;\u0026lt;SNIP\u0026gt;\u0026gt; # Networking settings, setting the private IP to the 10th IP in the subnet, and attaching to the right SG and Subnets source_dest_check = false private_ip = cidrhost(aws_subnet.public_a.cidr_block, 10) # IPv6 Address for the instance from the aws_network_interface block ipv6_addresses = [cidrhost(aws_subnet.public_a.ipv6_cidr_block, 16)] subnet_id = aws_subnet.public_a.id vpc_security_group_ids = [aws_security_group.test_sg.id] \u0026lt;\u0026lt;SNIP\u0026gt;\u0026gt; } Delete the ENI from the
State Before we remove the block, we need to remove the information from the State. This will prevent us from accidentally deleting the block, running an apply, and getting the 400 error mentioned before! To do this we will need to run the following command:\nterraform state rm aws_network_interface.test_eni\nNow you can delete the whole aws_network_interface block from your code.\nRun a plan to check With all this complete, we just need to run one very last terraform plan to confirm; if everything has gone right then your output should be:\nNo changes. Your infrastructure matches the configuration. Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed. Summary I would just like to point out that this is not the ideal way of doing this. I am sure there are other ways, but this is how I got around the issue. Thankfully in my case I had access to the Terraform State, which made the addition of the aws_network_interface a lot easier. This workaround doesn\u0026rsquo;t work too well when you do not have that access.\nThis technique can work not just on AWS, but on other cloud providers too - it\u0026rsquo;s mainly about the logical steps of importing the unmanaged resource, making the changes, and then releasing it back to the provider to manage.\nHowever, we are reminded here why pet instances are just that; sometimes they can be a bit of a pain, but you nurture them for a reason! In my case, I am just a little bit lazy, and didn\u0026rsquo;t want to have to set everything up again!\nHopefully I will continue this IPv6 series soon, where I will go over a number of other services - to see if we can\u0026rsquo;t push forward with the IPv6 transition.\nAny comments or queries will be greatly appreciated!\nOz The Cat For anyone that is wondering, this is Oz, he is a lovely cat!\nThis is Oz!
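One last post-script for anyone following along: once the workaround is complete, you can sanity-check the assignment outside of Terraform by reading the addresses straight off the ENI with the awscli (the interface ID below is the example one from this post; swap in your own):

```shell
# List the IPv6 addresses currently assigned to the primary ENI
aws ec2 describe-network-interfaces \
  --network-interface-ids eni-071f55452ccc18997 \
  --query 'NetworkInterfaces[0].Ipv6Addresses[].Ipv6Address' \
  --output text
```

If everything worked, this should print the same address that Terraform assigned.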
","date":"2023-03-04T14:26:18Z","image":"https://static.colinbarker.me.uk/img/blog/2023/02/nasa-Q1p7bh3SHj8-unsplash.jpg","permalink":"https://colinbarker.me.uk/blog/2023-03-04-enabling-ipv6-on-aws-using-terraform-ec2-part-2/","title":"Enabling IPv6 on AWS using Terraform - EC2 \"Pet\" Instance (Part 2)"},{"content":"Not a long post, just one for me to say that I am thrilled to have been accepted as an AWS Community Builder! It had been on my list for a while, but need a bit more confidence to really apply! I have been accepted on the Networking and Content Delivery category.\nTo find out more about the program head over to the AWS Community Builders site!\nSmall shout out to my fellow AWS SMEs for encouraging me to apply, and especially Jimmy Dahlqvist for being an awesome mentor through this process!\n","date":"2023-03-02T11:10:16Z","image":"https://static.colinbarker.me.uk/img/blog/2023/03/AWSCommunityBuilder-ColinBanner.png","permalink":"https://colinbarker.me.uk/blog/2023-03-02-aws-community-builder-colin-barker/","title":"AWS Community Builders Program - Accepted!"},{"content":" Header photo by Roman Synkevych 🇺🇦 on Unsplash\nEdited: 12th Jan 2025 - The AWS SDK and the Terraform AWS Provider was updated to make the thumbprint optional! See my latest blog post here for more information.\nEdited: 18th Sept 2024 - As AWS no longer require you to use a thumbprint when setting up the OIDC connection the thumbprint section is no longer needed, however the Terraform Provider still has an open bug for this. Hopefully, once this issue is closed I will remove the section completely!\nWhat is OpenID? Always a good start, understanding what the key component to this whole post is! As always though, I will reference Wikipedia\nOpenID is an open standard and decentralized authentication protocol promoted by the non-profit OpenID Foundation. 
It allows users to be authenticated by co-operating sites (known as relying parties, or RP) using a third-party identity provider (IDP) service\u0026hellip;\n― Wikipedia, OpenID Simple, right? Well, the technical details of how OpenID works probably deserve a much more in-depth technical blog of their own, but the one thing that we need to understand here is that OpenID allows users to be authenticated using third-party identity providers. In the case of AWS, their OpenID Connect setup allows a service in GitHub to authenticate to AWS and, through the IAM system, assume a specific role.\nThe question is, why go to the effort of setting this all up when a simple Access Key/Secret Key combination would work? Well, as you know from AWS Well-Architected best practices, you should always use temporary credentials over static ones. The assumption of an AWS role uses temporary credentials. With OpenID Connect-compatible identity providers, such as GitHub, you would need to set this up using a Web Identity source.
With this post, I will show you how I set this up for this blog (and for the Tokonatsu website!)\nGitHub Actions using IAM Access Keys This is where I started. As you can see below, GitHub has my secrets for the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, statically set quite a while ago!\nIAM Access Key and Secret Key statically used Within my GitHub Actions pipeline, I am using the Configure AWS Credentials action from the GitHub Marketplace to configure the secrets for use in the pipeline.\n- name: Configure AWS Credentials id: aws-credentials-configure uses: aws-actions/configure-aws-credentials@v1 with: aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }} aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }} aws-region: ${{ secrets.AWS_DEFAULT_REGION }} If you want to see the whole file, then there is a link here to the build-and-deploy.yaml (prior to the changes) file on GitHub.\nAs you can see, it is pretty simple, and it worked. The IAM user that owned those keys was happy to sit there and allow the pipeline to access the service. However, that IAM user is still static, and the keys need to be manually rotated. From a security stance, anyone who got hold of that key would be able to do as much to my website as the pipeline could. As I use aws s3 sync to copy the website up, along with the --delete parameter, it means that I need delete access for this IAM user! Not really the best!\nBy using Terraform, I was able to set up all the correct access and switch my pipeline over to using role assumption, and thus temporary credentials. GitHub do provide a walkthrough to set up the OpenID Connect, which is what I based this configuration on. Along with the Terraform documentation, hopefully this will help you with your journey!\nCreation of the OpenID Connect Provider Setting up the Identity Provider (IdP) will need to be the first step.
This action creates a description in IAM of the external IdP and establishes the trust between your account and the organisation, in this case GitHub. This step requires just a few options, some of which can be harder to get.\nUsing the walkthrough documentation, we can see that the following is required:\nThe provider URL - in the case of GitHub this is https://token.actions.githubusercontent.com The \u0026ldquo;Audience\u0026rdquo; - which scopes what can use this. Confusingly, in Terraform this is also known as the client_id_list. The Thumbprint of the endpoint - This one is the trickier one, as you will need to generate this yourself. Generating the thumbprint ⚠️ This section is no longer required - AWS and GitHub no longer require the setting of thumbprints between them; however, the Terraform Provider for AWS still has an open bug as it still requires a value to be entered. For the best option, just enter the value 0000000000000000000000000000000000000000.\nThis part wasn\u0026rsquo;t as clear in the GitHub documentation, but I went over to the AWS Documentation which gave me instructions on how to generate the thumbprint. You will need a copy of the openssl CLI to be able to do this, but the quickest way is as follows:\nUse the OpenSSL command to check against the provider URL to get the certificate. openssl s_client -servername token.actions.githubusercontent.com -showcerts -connect token.actions.githubusercontent.com:443 Grab the certificate shown in the output; you will see this starting with -----BEGIN CERTIFICATE-----, then place this content into a file. For this demo, I will use github_openid.crt.\nUse the OpenSSL command again to generate the fingerprint from the file created above.\nopenssl x509 -in github_openid.crt -fingerprint -sha1 -noout Which should output the fingerprint.
Strip away all the extra parts, including the : between each pair of hexadecimal characters, and you should end up with something like this:\n6938fd4d98bab03faadb97b34396831e3780aea1 ⚠️ Note: You will need to make the letters lowercase. Terraform is case sensitive for this variable, but AWS is not, so an uppercase thumbprint can send Terraform into a bit of a loop. ⚠️\nAdding the resource in Terraform With all of this, you can now start to put together the Terraform elements. We can use the Terraform resource for setting up the OpenID Connect Provider in IAM. Below is a code example, using the information gathered from the documentation and the thumbprint generation, all placed into a single resource object.\nresource \u0026#34;aws_iam_openid_connect_provider\u0026#34; \u0026#34;github\u0026#34; { url = \u0026#34;https://token.actions.githubusercontent.com\u0026#34; client_id_list = [ \u0026#34;sts.amazonaws.com\u0026#34; ] thumbprint_list = [ \u0026#34;6938fd4d98bab03faadb97b34396831e3780aea1\u0026#34; ] } This seems simple enough, but this is just authorising GitHub as a trusted source, so that identities from GitHub can be used to authenticate against AWS. The IAM role setup is where the main bulk of granting access is completed.\nCreating the IAM Role IAM Policy - Bucket Access Before access can be granted to the GitHub Actions pipeline, we will need to create a policy that defines the access that the role will have. There isn\u0026rsquo;t much difference between this part and creating any other policy; however, for this blog and the Tokonatsu website, we need some additional permissions beyond a simple s3:PutObject.\nIn the example below, as my S3 bucket is also a resource in Terraform, you can see how I have pulled the bucket ARN for the resource from the outputs of the S3 bucket creation.
It is normally best practice to reference variables and other outputs rather than using strings.\n# tfsec:ignore:aws-iam-no-policy-wildcards data \u0026#34;aws_iam_policy_document\u0026#34; \u0026#34;website_colins_blog_policy\u0026#34; { version = \u0026#34;2012-10-17\u0026#34; statement { effect = \u0026#34;Allow\u0026#34; resources = [ aws_s3_bucket.website_colins_blog.arn, # or \u0026#34;unique-bucket-name-for-site\u0026#34; \u0026#34;${aws_s3_bucket.website_colins_blog.arn}/*\u0026#34; # or \u0026#34;unique-bucket-name-for-site/*\u0026#34; ] actions = [ \u0026#34;s3:DeleteObject\u0026#34;, \u0026#34;s3:GetBucketLocation\u0026#34;, \u0026#34;s3:GetObject\u0026#34;, \u0026#34;s3:ListBucket\u0026#34;, \u0026#34;s3:PutObject\u0026#34; ] } } ⚠️ Note: As I use tfsec to keep an eye on my Terraform, it attempts to look at the policy and look for anything that might be considered an issue. On the first line you can see the tfsec:ignore:aws-iam-no-policy-wildcards comment, which means tfsec will ignore that rule when it checks my Terraform. As we need to give this role access to perform the listed actions on all objects in the bucket, a wildcard is easier - hence the rule to stop the error from showing up. ⚠️\nIAM Policy - Role Assumption As we will be using role assumption, an additional policy document needs to be created and added to the Role that tells it who can assume the role. You might have seen a similar version when using sts:AssumeRole as an action, with principals that cover other accounts, as an example. With sts:AssumeRoleWithWebIdentity, we are telling AWS that the role assumption should only happen for callers that have a \u0026ldquo;web identity\u0026rdquo; from one of the external providers.\nTo ensure the right IdP is used when deciding which identity to grant access to, we need to reference the ARN of the OpenID Connect Provider added earlier.
As the Terraform resource we used above was identified with aws_iam_openid_connect_provider.github, we can use one of its attributes to programmatically add the ARN to the principals. We will need to specify the type in Terraform as \u0026ldquo;Federated\u0026rdquo; as well.\nThe last two bits are condition blocks, and these are unique to the GitHub OpenID Connect setup. There is a little more detail on the GitHub Actions: OpenID Connect in AWS page. Using this page, I am adding two \u0026ldquo;StringLike\u0026rdquo; tests to the role, that look for two specific variables:\ntoken.actions.githubusercontent.com:sub - Which is used to specify which repo\u0026rsquo;s GitHub actions are allowed access. token.actions.githubusercontent.com:aud - That ensures that AWS\u0026rsquo;s STS service is the one that is requesting the identity type, and no others. For token.actions.githubusercontent.com:sub, my actual repo says that the repo called blog in my personal space mystcb, using any branch, can access. While this is very open, my personal blog can really only be edited by myself, and I want the role to also activate on any test branches. This is shown in my personal Terraform as repo:mystcb/blog:*. The example I have below is more specific, and only allows access if the GitHub action is working from the main branch.
What I wanted to show here is a secure version, while also showing how you can add wildcards to the value to cover more branches if need be.\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 data \u0026#34;aws_iam_policy_document\u0026#34; \u0026#34;website_colins_blog_role_assumption\u0026#34; { version = \u0026#34;2012-10-17\u0026#34; statement { effect = \u0026#34;Allow\u0026#34; actions = [\u0026#34;sts:AssumeRoleWithWebIdentity\u0026#34;] condition { test = \u0026#34;StringLike\u0026#34; variable = \u0026#34;token.actions.githubusercontent.com:sub\u0026#34; values = [\u0026#34;repo:mystcb/blog:ref:refs/heads/main\u0026#34;] } condition { test = \u0026#34;StringLike\u0026#34; variable = \u0026#34;token.actions.githubusercontent.com:aud\u0026#34; values = [\u0026#34;sts.amazonaws.com\u0026#34;] } principals { type = \u0026#34;Federated\u0026#34; identifiers = [aws_iam_openid_connect_provider.github.arn] } } } Now that we have our two policy documents as Terraform data objects, we can pull them together to create the role.\nCreation of the IAM Role For us to create the role, we need to pull together all the bits we have created so far. This will mean doing a few special tricks with the data resources.\nFirstly, we need to create an IAM Policy resource that the IAM Role resource can attach to the newly created role.\nThe IAM Policy resource requires 3 elements: the name, path, and the policy in JSON format. While the name and path can be as custom as you need, the policy in JSON format is what might trip a few people up. The IAM Policy Document data object has just one attribute output that can be referenced here: json. This conversion means we can quickly add the three together to make the IAM Policy object in AWS.\n1 2 3 4 5 resource \u0026#34;aws_iam_policy\u0026#34; \u0026#34;website_colins_blog\u0026#34; { name = \u0026#34;access_to_website_colins_blog_s3\u0026#34; # This is my example name!
path = \u0026#34;/\u0026#34; # Root path for ease policy = data.aws_iam_policy_document.website_colins_blog_policy.json } Next, we need to create the role itself, which has two key elements: the name and the assume_role_policy. The name can be whatever you want; however, the assume_role_policy is needed to let AWS IAM know what can assume this role. In our case, it is the JSON output from our second IAM Policy Document.\n1 2 3 4 resource \u0026#34;aws_iam_role\u0026#34; \u0026#34;website_colins_blog_github_role\u0026#34; { name = \u0026#34;access_to_website_colins_blog_s3_role\u0026#34; # This is my example name! assume_role_policy = data.aws_iam_policy_document.website_colins_blog_role_assumption.json } Great! Now we have a role, and we can assume it - well, not exactly, one last step. With IAM, you can attach multiple policies to a single role, which, if created in line with the aws_iam_role resource, can make it a little more complicated, and very long. The AWS provider allows us two methods to manage the role\u0026rsquo;s policies: one is through managed_policy_arns/inline_policy, the other is aws_iam_policy_attachment. The former two work very much the same, but take exclusive authority over the state of the IAM Role itself. This means if you also attach policies using the latter resource object, you will find Terraform getting stuck in a cycle. For this example, I am using the policy attachment resource.\n1 2 3 4 resource \u0026#34;aws_iam_role_policy_attachment\u0026#34; \u0026#34;website_colins_blog\u0026#34; { role = aws_iam_role.website_colins_blog_github_role.name policy_arn = aws_iam_policy.website_colins_blog.arn } This block is where all the references from before make this easier. This is where the role resource object and the policy resource object come together to create the role.
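As an aside, the StringLike conditions defined earlier behave like shell-style wildcard matches against the token's claims. This hypothetical Python sketch (the claim strings are made up for illustration, not taken from a real token) shows why repo:mystcb/blog:* covers every branch while the stricter pattern only matches main:

```python
from fnmatch import fnmatchcase

# Hypothetical sub claims GitHub would present for this repo
main_branch = "repo:mystcb/blog:ref:refs/heads/main"
test_branch = "repo:mystcb/blog:ref:refs/heads/test"

# The open pattern from my personal Terraform: any branch matches
assert fnmatchcase(main_branch, "repo:mystcb/blog:*")
assert fnmatchcase(test_branch, "repo:mystcb/blog:*")

# The stricter pattern from the example: only main matches
assert fnmatchcase(main_branch, "repo:mystcb/blog:ref:refs/heads/main")
assert not fnmatchcase(test_branch, "repo:mystcb/blog:ref:refs/heads/main")
```

AWS evaluates the real conditions server-side, of course; this is only a way to reason about which sub values a given pattern would let in.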
Now, AWS is aware of GitHub, its OpenID Connect provider, and we have given a specific repo\u0026rsquo;s GitHub actions that run on the main branch access to assume a role in AWS, which gives it access to AWS S3. The role assumption will use temporary credentials for each of the runs!\nOne last bit, this part will enable you to get the ARN for the role, which will be required for the configuration of GitHub.\n1 2 3 4 output \u0026#34;website_colins_blog_role_arn\u0026#34; { description = \u0026#34;The role ARN for the Website: Colin\u0026#39;s Blog Role\u0026#34; value = aws_iam_role.website_colins_blog_github_role.arn } Very simple, it just outputs the ARN for the role, which will need to be copied to GitHub. For the rest of the blog, I am going to be using an example role. This will not work as-is, so make sure you use the role that matches your account. The ARN we will be using will be\narn:aws:iam::12326264843:role/access_to_website_colins_blog_s3_role One IAM Role created, with federation to GitHub GitHub Actions/Workflow Updates To enable this, we need to update the workflow yaml file, and this is probably the easiest bit of the whole post!\nAdd the Role ARN as a secret This is where we need to move back to GitHub, grab the ARN from above, and add it to the Repository Secrets. You should be able to find your version at https://github.com/\u0026lt;yourname/org\u0026gt;/\u0026lt;yourrepo\u0026gt;/settings/secrets/actions. This is under Settings -\u0026gt; Secrets and variables -\u0026gt; Actions and click New Secret.
Enter a name for the secret, and its value, which I have put in:\nName: AWS_ROLE_ARN Secret: arn:aws:iam::12326264843:role/access_to_website_colins_blog_s3_role Entering in the new secret with the role ARN Once you have added that, make sure to remove the two existing Repository Secrets, in my case I called them\nAWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY The AWS CLI will always prefer access keys over role assumption in its credential priority, so it is best to remove them. With the two older secrets removed, you should now have just the AWS_ROLE_ARN and AWS_DEFAULT_REGION\nOnly the final two secrets left Update the Workflow YAML file For GitHub actions to be able to assume the role, there are two changes that need to be made to the workflow yaml file. The first one is enabling the workflow to interact with GitHub\u0026rsquo;s OIDC token endpoint. Part of the assumption process will require us to identify as a web identity from GitHub so that AWS knows who we are. As such, you will need to add additional permissions to the job. Specifically the following\nid-token to write contents to read So it should look something like this:\n1 2 3 4 5 6 jobs: hugo_build_and_deploy: runs-on: ubuntu-latest permissions: id-token: write contents: read Later on in the pipeline, where we configured the AWS credentials before, you will need to remove the older secret variables, and put the new secrets in:\n1 2 3 4 5 6 - name: Configure AWS Credentials id: aws-credentials-configure uses: aws-actions/configure-aws-credentials@v1 with: role-to-assume: ${{ secrets.AWS_ROLE_ARN }} aws-region: ${{ secrets.AWS_DEFAULT_REGION }} To see this whole file in context, I would recommend having a look at this blog\u0026rsquo;s workflow on GitHub!\nAll you need to do now is run the GitHub Actions workflow, and make sure it works! With my blog workflow, it was pretty much a drop-in replacement for the IAM credentials.
There were a few minor issues with my workflow, but nothing that following this wouldn\u0026rsquo;t have resolved for me!\nSome minor issues As you can see, it wasn\u0026rsquo;t exactly first time running for me! It did take a while, and I had also placed a --acl public-read as part of the aws s3 sync command, while the new bucket I created had been set to block public ACLs!\nJust a few mistakes! There was one other issue, and that was with the GITHUB_TOKEN that is used. In normal operation, without the added permissions, this token worked fine with an additional Marketplace action called GitHub Deployments. However, after changing this over to allow the OpenID Connect feature, the token mysteriously stopped working.\nSwitching to use a Fine-grained PAT On the 18th of October 2022, GitHub offered up a new service called Fine-grained personal access tokens. The idea is that rather than creating a very open Personal Access Token (PAT), you could create a token that was very limited in its reach. It is still in beta as I write this blog (28th Feb 2023).\nUsing this beta feature, I was able to create a new token, limiting it specifically to the blog repo, with specific permissions. The screenshot below shows the details about the new PAT.
(I am aware I could probably reduce the permissions a little more!)\nA new Fine-grained Personal Access Token (beta) From here, I added a new repository secret called REPO_TOKEN with the value of the newly generated token, and then updated the part in the workflow that needed it:\n1 2 3 4 5 6 7 8 - name: Set Deploy Status id: deploy-status-start uses: bobheadxi/deployments@v1 with: step: start token: ${{ secrets.REPO_TOKEN }} env: Prod ref: ${{ github.ref }} Round up Hopefully what I have shown you in this post is how to move away from IAM Credentials, and use the OpenID Connect features of both AWS and GitHub to enable role based assumption to gain access to an S3 bucket that stores, in my case, a static website.\nIf you do have any questions, comments, please let me know!\n","date":"2023-02-28T16:29:20Z","image":"https://static.colinbarker.me.uk/img/blog/2023/02/roman-synkevych-wX2L8L-fGeA-unsplash.jpg","permalink":"https://colinbarker.me.uk/blog/2023-02-28-github-actions-using-aws-and-openid/","title":"GitHub Actions using AWS and OpenID"},{"content":" Header photo by NASA on Unsplash\nWhat is IPv6? According to Wikipedia, Internet Protocol version 6 (IPv6) was introduced in December 1995 (just over 25 years ago!), based on the original IPv4 protocol that we all know and love today. The Internet Engineering Task Force (IETF) developed this new protocol to help deal with the (at the time) anticipated exhaustion of the IPv4 address range. This process seemed like it could be simple enough, create a new system to replace an older system and enable the expansion of the Internet to meet modern day standards.\nThis is where the problem lay. Adopting IPv6 wasn\u0026rsquo;t as simple as just replacing IPv4.
The two protocols are different enough that, on top of physical hardware changes (older devices unable to support IPv6), it also meant a different way of thinking when it came to working with networking both on the Internet and within internal networks (intranets, local networks, home networks). One of the biggest issues facing much of the world was that ISP adoption rates were surprisingly low.\nHowever, as IPv4 exhaustion happened as early as 2011, providers have started very quickly adopting the new standard.\nSo, what is IPv6? I would recommend reading the Wikipedia for more details, as much of what I would write here would essentially be copied just from that page! In summary though:\nIPv6 uses 128-bit addresses over IPv4\u0026rsquo;s 32-bit addresses, allowing approximately 3.4 x 10^38 addresses, over IPv4\u0026rsquo;s 4,294,967,296 (2^32) Addresses are in 8 groups of hexadecimal digits, separated by colons. For example: fd42:cb::123 in short hand. (which would expand to be fd42:00cb:0000:0000:0000:0000:0000:0123) Route Aggregation is built into the addressing scheme, allowing for the expansion of global route tables with minimal space used. Steve Leibson (in a now lost article) once said \u0026ldquo;[If] we could assign an IPv6 address to every atom on the surface of the earth, [we would] still have enough addresses left to do another 100+ Earths!\u0026rdquo; ⚠️ Note: I mentioned short hand above, and this comes with a lot of caveats, but the primary rule is: you can drop any leading zero within a group of an IPv6 address, and one run of consecutive all-zero groups can be collapsed to ::. For example 00cb:0000:0001 can be shortened to cb::1. However, you can only have ONE :: in each address (otherwise how does it know where to expand it!), so for example 00cb:0000:0100:0000:0001 would be shortened to cb:0:100::1.
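As a quick illustration of those rules, Python's ipaddress module (used here purely as a sketch, nothing AWS-specific) can round-trip between the long and short forms of the example address above:

```python
import ipaddress

addr = ipaddress.IPv6Address("fd42:cb::123")

# The fully expanded form restores every leading zero and zero group
assert addr.exploded == "fd42:00cb:0000:0000:0000:0000:0000:0123"

# And the canonical short form drops them all again
assert addr.compressed == "fd42:cb::123"
```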
I\u0026rsquo;ll go into this in more detail in a future post! ⚠️\nThis is why it is important to start embracing IPv6, we have a lot more space to make lives easier for the world, and the only way we can ensure the continued adoption of the protocol is to enable this everywhere.\nIn this post, I will go through how you can enable, using Terraform, IPv6 on your existing AWS Cloud Networks. When planning IPv6, there is a lot to consider, and there are some architectural changes that will need to be considered.\nSetting the scene - an existing AWS Cloud Network Basic AWS VPC with Networking This should be a very familiar layout to most people, a VPC in AWS with some very basic networking setup. In the diagram above, we have a VPC, with Public and Private Subnets in two availability zones. We have an Internet Gateway for the public subnets, and a NAT Gateway in each Availability Zone for the Private Subnets to talk to the internet. I have placed some sample IP addressing in, just for reference as part of this post.\nIf you wish to deploy this yourself, I have placed some sample code on my GitHub that you can use. (https://github.com/mystcb/ipv6-on-aws/tree/main/01-sample-vpc)\nThroughout this post, you will see me mention the cost of running this using an estimate. I have been using, for a while, a tool called infracost, which is an open source (with subscription based additions) cost estimator tool - https://www.infracost.io/. For this demonstration, using the sample code listed above, it would cost an estimated $76.65/month - so if you don\u0026rsquo;t want to rack up a bill, only deploy when you want to test, and use Terraform to destroy the services when you are done.\nAs an example, here is the cost estimate for this deployment:\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 # infracost breakdown --path=. Evaluating Terraform directory at .
✔ Downloading Terraform modules ✔ Evaluating Terraform directory ✔ Retrieving cloud prices to calculate costs Project: mystcb/ipv6-on-aws/01-sample-vpc Name Monthly Qty Unit Monthly Cost aws_eip.natgwip[1] └─ IP address (if unused) 730 hours $3.65 aws_nat_gateway.sample_natgw_subnet_a ├─ NAT gateway 730 hours $36.50 └─ Data processed Monthly cost depends on usage: $0.05 per GB aws_nat_gateway.sample_natgw_subnet_b ├─ NAT gateway 730 hours $36.50 └─ Data processed Monthly cost depends on usage: $0.05 per GB OVERALL TOTAL $76.65 ────────────────────────────────── Remember, all costs are estimates! With the cloud, it\u0026rsquo;s pay-as-you-use, utility-based billing, so the costs will reflect what you use. The above is just an estimate, so keep that in mind!\nIPv6 and AWS As mentioned before, there are some concepts that have to be considered when designing for IPv6. One specific concept for networking can seem a little confusing at first, but with the right security in place, will ensure that no accidental access to the service can happen.\nWithin AWS, IPv6 addresses are global unicast addresses. A unicast address is an address that identifies a unique object or node on a network. While the premise of a unicast address is not new (as it was the same in IPv4), with the exhaustion of IPv4 addresses, new methods to give unique addresses to multiple nodes were devised. Network Address Translation (NAT) was one such method for doing this. This allowed a single unicast IP address to be used by multiple nodes, by routing traffic through the NAT node and allowing it to re-write the headers to make it seem like the traffic had come from the unique address.\nNAT is a wide subject, and I am sure I will write more about it some day, but a real world example that you see in most places is your home network. You have multiple private nodes with access to the internet, typically using a single public unicast address.\nSo how does this relate to IPv6 and AWS?
Remember earlier, I mentioned that there are so many addresses available in the IPv6 ecosystem that you could give a unique address to every atom on the planet? Well, we can do just that, but only for the nodes that require addresses. This is what AWS does: each IPv6 address you assign to a node in AWS is a global unicast address, unique to that node.\nThis means a \u0026ldquo;private\u0026rdquo; IPv6 Subnet does sound like it might be complicated to set up, but actually it isn\u0026rsquo;t as bad as you think! However, to start the process off, we will begin by adding IPv6 to the Public Subnets, to set the groundwork, and go from there.\nEnabling IPv6 on your VPC Regardless of what you need IPv6 for, you need to enable this on your VPC. Just like the IPv4 CIDR that you assign to the VPC, you will need to assign an IPv6 CIDR to your VPC.\n⚠️ Difference: Private IPv6 addresses do exist, but you can\u0026rsquo;t assign them to AWS VPCs. You must use public CIDR ranges ⚠️\nIn a way, the above does make sense - every device is globally unique, so why would you need a private address? Personally, I use internal private addresses to make it easier to remember when connecting to servers, but I am very much an old-school person here, and you should be using name resolution to connect to instances!\nTherefore, when you go to assign an IPv6 CIDR range to your VPC, you have one of three options:\nIPAM-allocated Amazon-provided An IPv6 CIDR owned by you (BYOIP) For simplicity, I will be using the Amazon-provided CIDR ranges.
In the future, I will go over the IPAM-allocated and BYOIP options, but for now these are just additional ways to get an IPv6 CIDR on your VPC.\nAWS has access to a fairly large range of IPv6 addresses, which means they can allocate a unique set of addresses just for you.\n⚠️ Terminology: A Network Border Group is the name (chosen by AWS) that defines a unique set of Availability Zones or Local Zones where AWS advertises IP addresses ⚠️\nWhen requesting an IPv6 CIDR from AWS, they will need you to select a Network Border Group. This is to ensure that the IPv6 addresses you are receiving can be routed successfully to your VPC. Going back to the IPv6 description above, to make routing simpler in the IPv6 world, AWS will route specific ranges to specific groups, and therefore you have to select the right group. Thankfully, as the groups are quite large, for the majority of regions there is only a single Network Border Group, and Terraform will select this automatically!\nLet\u0026rsquo;s start with the vpc.tf that exists in the sample code (https://github.com/mystcb/ipv6-on-aws/blob/main/01-sample-vpc/vpc.tf).\n1 2 3 4 5 6 7 8 9 # Creation of the sample VPC for the region #tfsec:ignore:aws-ec2-require-vpc-flow-logs-for-all-vpcs resource \u0026#34;aws_vpc\u0026#34; \u0026#34;sample_vpc\u0026#34; { cidr_block = var.vpc_cidr tags = { \u0026#34;Name\u0026#34; = \u0026#34;Sample-VPC\u0026#34; } } Pretty simple, probably a little too simple! But keep in mind that this is just a sample for now!\nTerraform has two resource parameters that will be used to set and assign the IPv6 CIDR to the VPC: assign_generated_ipv6_cidr_block and ipv6_cidr_block_network_border_group. The border group is used a lot more with Local Zones, so we don\u0026rsquo;t need to worry about this at the moment, but do keep this in mind for more complex deployments.\nJust setting the assign_generated_ipv6_cidr_block to true is enough for AWS to assign the VPC a CIDR range.
Terraform documentation.\nYour file should now look like this:\n1 2 3 4 5 6 7 8 9 10 # Creation of the sample VPC for the region #tfsec:ignore:aws-ec2-require-vpc-flow-logs-for-all-vpcs resource \u0026#34;aws_vpc\u0026#34; \u0026#34;sample_vpc\u0026#34; { cidr_block = var.vpc_cidr assign_generated_ipv6_cidr_block = true tags = { \u0026#34;Name\u0026#34; = \u0026#34;Sample-VPC\u0026#34; } } With this, your terraform plan should now look like this:\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols: ~ update in-place Terraform will perform the following actions: # aws_vpc.sample_vpc will be updated in-place ~ resource \u0026#34;aws_vpc\u0026#34; \u0026#34;sample_vpc\u0026#34; { ~ assign_generated_ipv6_cidr_block = false -\u0026gt; true id = \u0026#34;vpc-0123456789abcdef0\u0026#34; + ipv6_association_id = (known after apply) + ipv6_cidr_block = (known after apply) tags = { \u0026#34;Name\u0026#34; = \u0026#34;Sample-VPC\u0026#34; } # (16 unchanged attributes hidden) } Plan: 0 to add, 1 to change, 0 to destroy. As you can see with the plan, this will grab some additional details to add to the resource object as attributes to reference later on. This will be key to making your Terraform portable, and not hard coded!\nOnce applied, this will then allocate the IPv6 CIDR to the VPC, and you should then see the following!\nConsole view of the CIDR allocations on a VPC showing the IPv6 allocation There we go, we have hit the first step! IPv6 is now enabled on the VPC! As you can see, we have received a /56 block of IPs. That gives you the room to have a total of 4,722,366,482,869,645,213,696 (2^72) hosts in your network.
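That figure comes straight from the prefix lengths: a /56 allocation leaves 128 - 56 = 72 bits for hosts, i.e. 2^72 addresses. A quick check in Python (just arithmetic, nothing AWS-specific):

```python
# Host capacity of an Amazon-provided /56 allocation
prefix_len = 56
host_bits = 128 - prefix_len  # bits left over for addressing
assert host_bits == 72

# Total addresses available in the allocation: 2^72
assert 2 ** host_bits == 4_722_366_482_869_645_213_696
```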
I don\u0026rsquo;t think we will be running out any time soon!\nNext step, let\u0026rsquo;s assign each of the subnets their own CIDR, so that resources in the subnets can be given their own IPv6 address.\nAdding IPv6 CIDRs to the public subnets Just like IPv4 CIDRs, you can break down the IPv6 range you have into smaller ranges that are all routed internally using AWS\u0026rsquo;s VPC networking. With a manual set up you would normally do something like this:\nIPv4 VPC CIDR: 192.168.0.0/20 Public Subnet in AZ1: 192.168.10.0/24 Public Subnet in AZ2: 192.168.11.0/24 This is shown in the sample VPC we are using in this post. Simple enough, right? Breaking down the CIDR into smaller subnets, and then assigning them to the correct location. It is very much the same with IPv6, but the ranges are just that much larger, so sometimes it\u0026rsquo;s best to use an automated method for doing this. What everyone should be doing is the automatic generation of the ranges for IPv4 as well, which is available in the example!\nTo do this, we can use a terraform function called cidrsubnet (https://developer.hashicorp.com/terraform/language/functions/cidrsubnet). This function can calculate the subnet addresses within a given IP network block or CIDR, and then be used as a value for a variable in the aws_subnet resource block.\nThis function takes a bit of getting used to, but here is my trick for understanding how it works!\nThe function looks like this:\n1 cidrsubnet(prefix, newbits, netnum) For further details, feel free to read the documentation above, but for our sample we will use the following:\nprefix the CIDR range. Available as an attribute as this is generated by AWS. aws_vpc.sample_vpc.ipv6_cidr_block newbits this is the number of additional bits you need to add to the CIDR prefix, to break the network down. In our case we will use 8 as it is a round number, but you will need to size this to your specifications.
netnum this is the network number assigned to the broken down blocks that you will select for this subnet. It can\u0026rsquo;t be more than 2^newbits - 1, and you can\u0026rsquo;t use the same netnum on multiple subnets. The trick I have used is as follows. newbits is calculated as the difference between the CIDR\u0026rsquo;s prefix /56 and the network size you need. So in our case I would like to make each subnet a /64 in size. The difference between the two (64 - 56 = 8) means the bit difference is 8. For the moment, I will leave this here, but I will create an article in the future that explains how this works, and why it\u0026rsquo;s the number of bits!\nFor the netnum I tend to visualise it in the diagram - I wanted 4 subnets in my sample (2 public, 2 private), so I am able to reference the networks created above by their counter, with the first network being 0, and the last network being 3 (as the counter starts at 0).\nSo for our networks, we can enter the values and get the following:\n1 2 3 4 cidrsubnet(aws_vpc.sample_vpc.ipv6_cidr_block, 8, 0) # Subnet 1 - x:x::0:0/64 cidrsubnet(aws_vpc.sample_vpc.ipv6_cidr_block, 8, 1) # Subnet 2 - x:x::1:0/64 cidrsubnet(aws_vpc.sample_vpc.ipv6_cidr_block, 8, 2) # Subnet 3 - x:x::2:0/64 cidrsubnet(aws_vpc.sample_vpc.ipv6_cidr_block, 8, 3) # Subnet 4 - x:x::3:0/64 The next step is the parameter for the aws_subnet resource block, which is ipv6_cidr_block. Essentially the same as the cidr_block parameter, but for IPv6!\nSo for each of our public subnets, let\u0026rsquo;s add this in. In our sample file vpc_subnets.tf (https://github.com/mystcb/ipv6-on-aws/blob/main/01-sample-vpc/vpc_subnets.tf), we have two public subnets, a and b. So let\u0026rsquo;s make the change.
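Before touching the Terraform, the cidrsubnet arithmetic can be sanity-checked with Python's ipaddress module. This is only a sketch: the 2001:db8:… /56 is a made-up documentation-range prefix standing in for whatever AWS actually allocated you.

```python
import ipaddress

# A made-up /56, standing in for the Amazon-provided allocation
vpc = ipaddress.IPv6Network("2001:db8:1234:5600::/56")

# Equivalent of cidrsubnet(prefix, 8, n): add 8 bits, pick network n
subnets = list(vpc.subnets(prefixlen_diff=8))
assert len(subnets) == 256  # 8 new bits -> 2^8 possible /64s

assert str(subnets[0]) == "2001:db8:1234:5600::/64"  # netnum 0
assert str(subnets[1]) == "2001:db8:1234:5601::/64"  # netnum 1
assert str(subnets[3]) == "2001:db8:1234:5603::/64"  # netnum 3
```

The network number simply counts up in the 8 bits that were added, exactly as the cidrsubnet comments above show.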
Below is the example with the new parameters added:\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 # Public Subnet A resource \u0026#34;aws_subnet\u0026#34; \u0026#34;public_a\u0026#34; { vpc_id = aws_vpc.sample_vpc.id ipv6_cidr_block = cidrsubnet(aws_vpc.sample_vpc.ipv6_cidr_block, 8, 0) cidr_block = cidrsubnet(aws_vpc.sample_vpc.cidr_block, 4, 10) availability_zone_id = data.aws_availability_zones.available.zone_ids[0] tags = { \u0026#34;Name\u0026#34; = \u0026#34;Public-Subnet-A\u0026#34; } } # Public Subnet B resource \u0026#34;aws_subnet\u0026#34; \u0026#34;public_b\u0026#34; { vpc_id = aws_vpc.sample_vpc.id ipv6_cidr_block = cidrsubnet(aws_vpc.sample_vpc.ipv6_cidr_block, 8, 1) cidr_block = cidrsubnet(aws_vpc.sample_vpc.cidr_block, 4, 11) availability_zone_id = data.aws_availability_zones.available.zone_ids[1] tags = { \u0026#34;Name\u0026#34; = \u0026#34;Public-Subnet-B\u0026#34; } } Once again, we run our terraform plan and we should get an output similar to this:\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 Terraform used the selected providers to generate the following execution plan. 
Resource actions are indicated with the following symbols: ~ update in-place Terraform will perform the following actions: # aws_subnet.public_a will be updated in-place ~ resource \u0026#34;aws_subnet\u0026#34; \u0026#34;public_a\u0026#34; { id = \u0026#34;subnet-0123456789abcdefg\u0026#34; + ipv6_cidr_block = \u0026#34;xxxx:yyyy:zzzz:nnn1::/64\u0026#34; tags = { \u0026#34;Name\u0026#34; = \u0026#34;Public-Subnet-A\u0026#34; } # (15 unchanged attributes hidden) } # aws_subnet.public_b will be updated in-place ~ resource \u0026#34;aws_subnet\u0026#34; \u0026#34;public_b\u0026#34; { id = \u0026#34;subnet-gfedcba9876543210\u0026#34; + ipv6_cidr_block = \u0026#34;xxxx:yyyy:zzzz:nnn2::/64\u0026#34; tags = { \u0026#34;Name\u0026#34; = \u0026#34;Public-Subnet-B\u0026#34; } # (15 unchanged attributes hidden) } Plan: 0 to add, 2 to change, 0 to destroy. The plan shows the addition of the new CIDR block to each subnet, noting the network number changing slightly to accommodate the size we requested. Once applied, we should see these details in the console:\nIPv6 CIDR on one of the two sample subnets Perfect, now let\u0026rsquo;s launch an EC2 instance and then try and connect to something using the IPv6 address!\nThe EC2 instance can\u0026#39;t connect to an IPv6 address Well, almost there - we\u0026rsquo;re missing the routing.\nAdding IPv6 Routing to the public subnets Having a look at our existing route tables, we can see that AWS added in the route for the VPC IPv6 CIDR to target the local VPC in the route table, which means traffic can find resources in the VPC that have an IPv6 address within the CIDR. Great for local traffic, but we need to be able to access the outside world using IPv6!\n⚠️ Note: While traffic can still route with IPv4, we need to enable routing for IPv6, otherwise the server won\u0026rsquo;t be able to send IPv6 traffic back to anything inbound.
⚠️\nThe sample Public-Route-Table doesn\u0026#39;t have any IPv6 route to the internet We can also see this in the sample code too, in the vpc_public_routing.tf file (https://github.com/mystcb/ipv6-on-aws/blob/main/01-sample-vpc/vpc_public_routing.tf)\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 # Primary Sample Route Table (Public) resource \u0026#34;aws_route_table\u0026#34; \u0026#34;public_rt\u0026#34; { vpc_id = aws_vpc.sample_vpc.id tags = { \u0026#34;Name\u0026#34; = \u0026#34;Public-Route-Table\u0026#34; } } # Route entry specifically to allow access to the Internet Gateway resource \u0026#34;aws_route\u0026#34; \u0026#34;public2igw\u0026#34; { route_table_id = aws_route_table.public_rt.id destination_cidr_block = \u0026#34;0.0.0.0/0\u0026#34; gateway_id = aws_internet_gateway.sample_igw.id } Here we can see the Public-Route-Table resources, but it only shows traffic to the Internet Gateway (IGW) for IPv4 traffic. We will need to add a route to allow IPv6 traffic to hit the IGW.\nWe can do this by creating a new aws_route resource, but we need to use the destination_ipv6_cidr_block parameter instead.\nFor the destination though, with IPv4 you could use the block 0.0.0.0/0 to mean \u0026ldquo;all traffic\u0026rdquo;. If we were to type the IPv6 version out in full, it would look like 0000:0000:0000:0000:0000:0000:0000:0000/0. Bit of a pain! Thankfully we can apply the short hand rule mentioned at the start, remove all the 0\u0026rsquo;s, shrink down, and you get quite simply ::/0, which suddenly seems much shorter than the IPv4 version! 
With this we can add this as the destination IPv6 block.\n1 2 3 4 5 resource \u0026#34;aws_route\u0026#34; \u0026#34;public2igw_ipv6\u0026#34; { route_table_id = aws_route_table.public_rt.id destination_ipv6_cidr_block = \u0026#34;::/0\u0026#34; gateway_id = aws_internet_gateway.sample_igw.id } Once again, let\u0026rsquo;s run the terraform plan and we should get something like this:\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols: + create Terraform will perform the following actions: # aws_route.public2igw_ipv6 will be created + resource \u0026#34;aws_route\u0026#34; \u0026#34;public2igw_ipv6\u0026#34; { + destination_ipv6_cidr_block = \u0026#34;::/0\u0026#34; + gateway_id = \u0026#34;igw-0123456789abcdefg\u0026#34; + id = (known after apply) + instance_id = (known after apply) + instance_owner_id = (known after apply) + network_interface_id = (known after apply) + origin = (known after apply) + route_table_id = \u0026#34;rtb-0123456789abcdefg\u0026#34; + state = (known after apply) } Plan: 1 to add, 0 to change, 0 to destroy. And once applied, we should be able to see the new route in the routing table!\n::/0 has been added to the route table, pointing to the Internet Gateway (IGW) Jumping back on our EC2 instance, and we get a connection!\nWe are now able to connect using IPv6 to the outside world! ⚠️ Attention: Inbound traffic to an EC2 instance, for example, will still be protected by its Security Group. The rules in a security group are specific to the IP protocol, so an inbound allow rule for IPv4 will only allow IPv4.
Make sure that you add any additional IPv6 rules to the security group to permit inbound access to resources in the Public Subnet ⚠️\nAdding IPv6 outbound routing to the private subnets We are getting to the last part of this post about enabling IPv6 on AWS, and we do need to cover the private subnets to complete the sample network configuration. This part will come with a few warnings, but if you are keeping in line with the AWS Well-Architected Framework, this will not be an issue at all!\nThinking back to what we mentioned before, AWS will use global unicast addresses for resources and services in AWS. This means that you do not have a \u0026ldquo;private\u0026rdquo; CIDR for your private subnets. As you know, in IPv4 there is RFC1918 that defines a number of IP blocks, specifically for \u0026ldquo;internal connectivity\u0026rdquo;. These IP ranges are not routable on the public internet. This allowed the expansion of devices within a private network, without using up the public internet space. As IPv6 can assign an address to every device on the planet, it makes it easier for us to ensure that the address assigned to a node is unique.\nNext we have to look at the definition of what AWS considers a \u0026ldquo;public\u0026rdquo; subnet and a \u0026ldquo;private\u0026rdquo; subnet. A \u0026ldquo;public\u0026rdquo; subnet is one where the route table for the subnet points its outbound internet traffic directly to an Internet Gateway (IGW), and the IGW allows external traffic to route to the subnet. 
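The split described above - RFC1918 ranges being private to IPv4, while IPv6 hands out globally unique addresses - can be illustrated with a short Python snippet (the addresses here are made-up examples, not taken from the sample VPC):

```python
import ipaddress

# An RFC1918 IPv4 address: private, and not routable on the public internet
v4 = ipaddress.ip_address("192.168.12.10")
print(v4.is_private, v4.is_global)  # True False

# A global unicast IPv6 address, like the ones AWS assigns to a VPC:
# globally unique and publicly routable
v6 = ipaddress.ip_address("2600:1f16:abc:de01::10")
print(v6.is_global, v6.is_private)  # True False
```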
For a \u0026ldquo;private\u0026rdquo; subnet, there is no IGW for it to connect to; instead, you would use a service such as a NAT Gateway (NGW) to connect through to the internet. As the NAT Gateway doesn\u0026rsquo;t allow traffic inbound to the network, and the IP address of the node will be a non-internet-routable IP address, traffic can\u0026rsquo;t reach the service in the VPC.\nThis definition still applies to private IPv6 subnets, and this is why AWS created the IPv6 Egress-Only Internet Gateway. Much like the IPv4 Internet Gateway counterpart, the IPv6 Egress-Only Internet Gateway is a horizontally scaled, redundant, and highly available VPC component that allows outbound communication for IPv6 traffic to the internet. As this new IPv6 egress-only internet gateway only permits traffic outbound, and not inbound, this will make a subnet private, as traffic cannot reach it.\nAs this is a VPC component, and not a service like the NAT Gateway, you technically only need to deploy one, like the Internet Gateway, and point outbound traffic to the egress-only internet gateway, and it will scale as needed.\nWith this in mind, let\u0026rsquo;s start by adding in the Egress-only Internet Gateway. This is created by the Terraform resource aws_egress_only_internet_gateway (https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/egress_only_internet_gateway). 
This is very similar to the normal Internet Gateway, and even the set up is the same.\nWithin the vpc_private_routing.tf file (https://github.com/mystcb/ipv6-on-aws/blob/main/01-sample-vpc/vpc_private_routing.tf) you will need to add the following resource:\n1 2 3 4 5 6 7 8 # IPv6 Egress-only Internet Gateway resource \u0026#34;aws_egress_only_internet_gateway\u0026#34; \u0026#34;sample_ipv6_egress_igw\u0026#34; { vpc_id = aws_vpc.sample_vpc.id tags = { \u0026#34;Name\u0026#34; = \u0026#34;Sample-VPC-IPv6-Egress-Only-IGW\u0026#34; } } Once again, running terraform plan will output something similar to this:\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols: + create Terraform will perform the following actions: # aws_egress_only_internet_gateway.sample_ipv6_egress_igw will be created + resource \u0026#34;aws_egress_only_internet_gateway\u0026#34; \u0026#34;sample_ipv6_egress_igw\u0026#34; { + id = (known after apply) + tags = { + \u0026#34;Name\u0026#34; = \u0026#34;Sample-VPC-IPv6-Egress-Only-IGW\u0026#34; } + tags_all = { + \u0026#34;Environment\u0026#34; = \u0026#34;Sandbox\u0026#34; + \u0026#34;Name\u0026#34; = \u0026#34;Sample-VPC-IPv6-Egress-Only-IGW\u0026#34; + \u0026#34;Source\u0026#34; = \u0026#34;Terrform\u0026#34; } + vpc_id = \u0026#34;vpc-0123456789abcdefg\u0026#34; } Plan: 1 to add, 0 to change, 0 to destroy. At this point, I would like to point out - that all the changes we have made so far, have not increased the cost of running! Much like the Internet Gateway, the only cost you have is the outbound traffic. As the Egress-Only Internet Gateway is the same, there is no cost for running this in a VPC other than the outbound traffic you intend to push through it. 
NAT Gateways, by contrast, will cost you about $36.50/month each, and for best practice you would need one in each availability zone, which soon adds up. It does make IPv6-only networking in AWS far cheaper than IPv4!\nApply the changes, and the new Egress-only Internet Gateway will be created.\nAdding IPv6 CIDRs to the private subnets The next step is to add the CIDR blocks to the private subnets. This is identical to the public subnet addition, as there is no difference between public blocks and private blocks within the VPC IPv6 range. So for speed, we just repeat the same steps.\nHead back to our sample file vpc_subnets.tf (https://github.com/mystcb/ipv6-on-aws/blob/main/01-sample-vpc/vpc_subnets.tf), and look for the two \u0026ldquo;private\u0026rdquo; subnets. Add in the ipv6_cidr_block for each of them, based on the next two blocks calculated earlier:\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 # Private Subnet A resource \u0026#34;aws_subnet\u0026#34; \u0026#34;private_a\u0026#34; { vpc_id = aws_vpc.sample_vpc.id ipv6_cidr_block = cidrsubnet(aws_vpc.sample_vpc.ipv6_cidr_block, 8, 2) cidr_block = cidrsubnet(aws_vpc.sample_vpc.cidr_block, 4, 12) availability_zone_id = data.aws_availability_zones.available.zone_ids[0] tags = { \u0026#34;Name\u0026#34; = \u0026#34;Private-Subnet-A\u0026#34; } } # Private Subnet B resource \u0026#34;aws_subnet\u0026#34; \u0026#34;private_b\u0026#34; { vpc_id = aws_vpc.sample_vpc.id ipv6_cidr_block = cidrsubnet(aws_vpc.sample_vpc.ipv6_cidr_block, 8, 3) cidr_block = cidrsubnet(aws_vpc.sample_vpc.cidr_block, 4, 13) availability_zone_id = data.aws_availability_zones.available.zone_ids[1] tags = { \u0026#34;Name\u0026#34; = \u0026#34;Private-Subnet-B\u0026#34; } } Run the terraform plan and apply the changes to assign the blocks to the subnets.\nAdding IPv6 Routing to the private subnets This is where the limitations of the IPv4 NAT Gateway make the 
changes for a private subnet more operationally involved. In our sample code, we created two route tables for the private subnets, one for each availability zone, so that traffic within the subnet routes through the NAT Gateway for that availability zone.\nThis means that, unlike the public subnet where we only needed to add a single route in Terraform, we have to add two routes and attach one to each route table.\nIf you were creating an IPv6-only VPC, then you could reduce the work and have a single route table that works for both availability zones! However, we will look at this at a later date.\nThis time, we need to head to the vpc_private_routing.tf file (https://github.com/mystcb/ipv6-on-aws/blob/main/01-sample-vpc/vpc_private_routing.tf) again, and we need to add in the two new routes.\nIf you look at the file, you will see that for the IPv4 NAT Gateway, there is a specific parameter that you use to tell Terraform to issue the right API command to AWS to add the route, which you can see here:\n1 2 3 4 5 6 # Route for Subnet A to access the NAT Gateway resource \u0026#34;aws_route\u0026#34; \u0026#34;private2natgwa\u0026#34; { route_table_id = aws_route_table.private_rt_a.id destination_cidr_block = \u0026#34;0.0.0.0/0\u0026#34; nat_gateway_id = aws_nat_gateway.sample_natgw_subnet_a.id } The nat_gateway_id parameter points specifically to the ID of the NAT Gateway within the availability zone. 
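As an aside, the cidrsubnet() arithmetic used earlier to carve the /64 subnet blocks out of the VPC range can be reproduced outside Terraform. This is a hedged sketch using a made-up /56 allocation (the real one is assigned by AWS), mirroring cidrsubnet(aws_vpc.sample_vpc.ipv6_cidr_block, 8, 2) and cidrsubnet(..., 8, 3):

```python
import ipaddress

# Hypothetical /56 IPv6 block standing in for the AWS-assigned VPC CIDR
vpc_ipv6 = ipaddress.ip_network("2600:1f16:abc:de00::/56")

# Terraform's cidrsubnet(prefix, 8, netnum): add 8 bits (/56 -> /64)
# and take the netnum-th /64 out of the resulting 256 subnets
subnets = list(vpc_ipv6.subnets(prefixlen_diff=8))
print(subnets[2])  # 2600:1f16:abc:de02::/64 (Private Subnet A)
print(subnets[3])  # 2600:1f16:abc:de03::/64 (Private Subnet B)
```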
Much like in the public subnet you would use gateway_id for the Internet Gateway, you can use the egress_only_gateway_id for the IPv6 traffic.\nTherefore, we will need to add 2 new blocks to the terraform file, to add in the route ::/0 to point to the Egress-only Internet Gateway.\n⚠️ Attention: Remember, for IPv6 routes, you will need to use the destination_ipv6_cidr_block as part of the route table resource ⚠️\n1 2 3 4 5 6 7 8 9 10 11 12 13 # Route for Subnet A to access the Egress-Only Internet Gateway resource \u0026#34;aws_route\u0026#34; \u0026#34;private2ipv6egressigwa\u0026#34; { route_table_id = aws_route_table.private_rt_a.id destination_ipv6_cidr_block = \u0026#34;::/0\u0026#34; egress_only_gateway_id = aws_egress_only_internet_gateway.sample_ipv6_egress_igw.id } # Route for Subnet B to access the Egress-Only Internet Gateway resource \u0026#34;aws_route\u0026#34; \u0026#34;private2ipv6egressigwb\u0026#34; { route_table_id = aws_route_table.private_rt_b.id destination_ipv6_cidr_block = \u0026#34;::/0\u0026#34; egress_only_gateway_id = aws_egress_only_internet_gateway.sample_ipv6_egress_igw.id } For one final time, run the terraform plan and you should see an output similar to this:\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 Terraform used the selected providers to generate the following execution plan. 
Resource actions are indicated with the following symbols: + create Terraform will perform the following actions: # aws_route.private2ipv6egressigwa will be created + resource \u0026#34;aws_route\u0026#34; \u0026#34;private2ipv6egressigwa\u0026#34; { + destination_ipv6_cidr_block = \u0026#34;::/0\u0026#34; + egress_only_gateway_id = \u0026#34;eigw-0123456789abcdefg\u0026#34; + id = (known after apply) + instance_id = (known after apply) + instance_owner_id = (known after apply) + network_interface_id = (known after apply) + origin = (known after apply) + route_table_id = \u0026#34;rtb-0123456789abcdef1\u0026#34; + state = (known after apply) } # aws_route.private2ipv6egressigwb will be created + resource \u0026#34;aws_route\u0026#34; \u0026#34;private2ipv6egressigwb\u0026#34; { + destination_ipv6_cidr_block = \u0026#34;::/0\u0026#34; + egress_only_gateway_id = \u0026#34;eigw-0123456789abcdefg\u0026#34; + id = (known after apply) + instance_id = (known after apply) + instance_owner_id = (known after apply) + network_interface_id = (known after apply) + origin = (known after apply) + route_table_id = \u0026#34;rtb-0123456789abcdef2\u0026#34; + state = (known after apply) } Plan: 2 to add, 0 to change, 0 to destroy. Remember, that we are adding two routes, as we have two route tables to make the change to. Apply the changes, and you should see the new route in the route tables:\nNew route pointing to the new Egress-Only Internet Gateway VPC resource So, jumping on an EC2 instance in the private subnet (you can see on the command line that the instance is in the 192.168.12.0/24 subnet), you can see we have complete outbound access on IPv6!\nPrivate Subnet EC2 instance with connectivity through the Egress-Only Internet Gateway The final outcome So you finally did it, you have enabled IPv6 on your VPC. 
If all went to plan, you should have something that looks like this:\nSample VPC with IPv6 enabled To make things a little simpler, you can also check out the 02-vpc-with-ipv6 folder in the sample code (https://github.com/mystcb/ipv6-on-aws/tree/main/02-vpc-with-ipv6), which will produce the same output as the diagram, if you followed along with the changes!\nSummary and round up Thank you for getting this far! There is a lot here, but hopefully I have been able to show you how to add IPv6 to an AWS VPC using Terraform. However, this is only the beginning. Throughout my time working with IPv6, there will always be different issues to trip you up, and there are far more features than just a simple VPC!\nIn a future post I hope to show you the following:\nAdding IPv6 to an existing EC2 instance inside a VPC without rebuilding it. A little harder than I expected with Terraform, but I did find a way around it! IPv6-only VPCs! Much cheaper to run than a normal IPv4 VPC, but at what \u0026ldquo;cost\u0026rdquo;? Adoption of IPv6 is still quite low, but there are a number of design patterns that mean an IPv6 VPC might work better for some situations. Going into IPv6 in more detail. This is only the very basic information on IPv6, and a lot of what I have mentioned does come with caveats! I hope to explain these in more detail later on Bits? Why does one number mean something else, and why on earth do we count in bits? - Hope you have learnt binary for this one! Thank you again, and feel free to send me any questions, or help me with any corrections to this post as well! Hopefully, I will see you in a future post!\nP.S. The final cost of me running this for the creation of the post was mainly the NAT Gateways! Everything else was free! It only cost me $1.50 total! Always make sure you run terraform destroy at the end!\nP.P.S. 
Happy Birthday to me!\nFurther Reading:\nIPv6 on AWS IPv6 - Wikipedia Sample Code for this Post ","date":"2023-02-11T15:08:20Z","image":"https://static.colinbarker.me.uk/img/blog/2023/02/nasa-Q1p7bh3SHj8-unsplash.jpg","permalink":"https://colinbarker.me.uk/blog/2023-02-11-enabling-ipv6-on-aws-using-terraform/","title":"Enabling IPv6 on AWS using Terraform (Part 1)"},{"content":"So, you have a need to put a firewall appliance in the cloud, you have security pushing you hard to ensure that it has to be this appliance, and it only comes as an AMI to run on EC2. What if there is a very specific security need? Maybe compliance reasons, or even as simple as just ensuring that those that look after the network are able to manage this new network in the cloud. AWS has a solution for you! The AWS Gateway Load Balancer.\nAs luck would have it, I was working with a customer that had this very specific requirement as part of their migration into AWS. They had to continue to keep the level of governance on the traffic moving in and out of AWS, while also reducing the amount of time the customer needed to re-train their employees.\nAs much as it would be simple to say, let\u0026rsquo;s spin a GWLB up and start using it, there are some caveats to the install, but hopefully this quick guide will help you through this.\nAWS released the GWLB as part of an additional service that can enable scalability to instances that once were considered the \u0026ldquo;pets\u0026rdquo; of the cloud world. Now it would be possible to use appliances on EC2 that supported the GENEVE protocol, in an auto-scaling, multi-availability zone (Multi-AZ), highly available deployment, removing a lot of the overhead required to look after a number of unique appliances. 
This halfway house between fully cloud native and physical data centre deployments opened up the door to much more secure and acceptable cloud-based deployments.\nTo dive into this further we will need to discuss a few concepts that, while sounding new, actually aren\u0026rsquo;t that far off what you would expect from a cloud deployment using load balancers.\nWhat is the GENEVE protocol? In simple terms, it is an encapsulation protocol. For more information, I would highly recommend reading Benjamin Schmaus\u0026rsquo;s blog post on What is GENEVE? on the RedHat site.\nWe see encapsulation in everyday life, as well as in the networking world; we just sometimes miss the comparisons. An example would be a letter sent by good ole \u0026ldquo;snail mail\u0026rdquo;: you usually start with the message inside the envelope, then you put a name, address, postal code, and sometimes the country (or the planet!). Looking at the order in reverse, we can start to see how you have encapsulated your message so that it is received by the correct person.\nCountry - Gets you to the right local area on planet Earth Postal Code - The main part that is needed, in most cases this gets you down to the street level or building level Address - These days, not actually needed with the post code, but within the address, you might have a building number or name Person\u0026rsquo;s Name - Now that it is in the right building, the right person can open the message There we have \u0026ldquo;encapsulation\u0026rdquo; broken down; you can then apply this to the GENEVE protocol.\nThe payload from the client or service is generated Additional headers including the ethernet headers for the original payload are added Additional GENEVE-specific headers are added including protocol type and Virtual Network Identifier (VNI) Here is where a new GENEVE payload is generated Further ethernet headers are added to the new payload that include the new ethernet headers, IP, and packet types For more technical 
detail, see RFC8926, Section 3\nAt this point the original payload can be moved around without changing the content, allowing it to be shipped to the firewall appliance, where the payload can then be broken down, scanned, the headers put back on, and sent back. Once sent back, the GENEVE-specific encapsulation is removed, and the payload can continue along to its destination.\nThis is why the AWS Gateway Load Balancer can be so powerful when dealing with traffic that needs to be evaluated. As the original payload is encapsulated, you could have a fleet of load-balanced network appliances filtering the traffic, something not possible with traditional networking. (Big caveat that networking is a large and wonderful subject, whereby you could do this, but for the sake of simplicity, a basic network typically won\u0026rsquo;t have this level of networking!)\nHow does this work in AWS? Just setting up a simple Gateway Load Balancer in AWS isn\u0026rsquo;t the whole setup. Unlike Application Load Balancers working at the Layer 7: Application layer of the OSI model, which, when configured correctly, point to their destination and serve the traffic, or the Network Load Balancer working at the Layer 4: Transport layer, the Gateway Load Balancer goes one level further, working at the Layer 3: Network layer directly. As this is lower down the OSI model, it means that traffic manipulation needs additional features added to enable the payload to reach the correct destinations.\nTo make this work, we need to take a look at two additional features that exist in the AWS world:\nVPC PrivateLink (VPC Endpoints) Much like other services that use VPC Private Link, this enables you to share services within AWS without giving direct network access to the service in question. Using a VPC Endpoint Service, you can connect this up to a compatible service, like a Load Balancer, and share the Endpoint to another place in AWS. 
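To make the encapsulation a little more concrete before moving on, here is an illustrative Python sketch (my own addition, not from the post) that packs the fixed 8-byte GENEVE header described in RFC8926, Section 3 - version, option length, flags, protocol type, and the 24-bit VNI:

```python
import struct

def geneve_base_header(vni: int, protocol: int = 0x6558) -> bytes:
    """Pack the fixed 8-byte GENEVE header (version 0, no options).

    0x6558 is "Transparent Ethernet Bridging", meaning the inner
    payload is the original Ethernet frame, as described above.
    """
    ver_optlen = 0x00                  # Ver (2 bits) = 0, Opt Len (6 bits) = 0
    flags = 0x00                       # O (control) and C (critical) bits clear
    vni_field = (vni & 0xFFFFFF) << 8  # 24-bit VNI followed by 8 reserved bits
    return struct.pack("!BBHI", ver_optlen, flags, protocol, vni_field)

header = geneve_base_header(vni=0x1234)
print(len(header))                              # 8
print(hex(int.from_bytes(header[4:7], "big")))  # 0x1234 (the VNI)
```

A real GWLB flow would wrap the whole original frame after this header; the point is simply that the outer headers are added and stripped without the inner payload ever changing.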
Edge Associated VPC Route tables This additional feature is where the \u0026ldquo;magic\u0026rdquo; happens! Building route tables, you would normally attach them to subnets within the VPC to create routes and help network traffic reach the correct destination. With an Edge Associated Route, this route can be attached to the Edge of the VPC, one that an Internet Gateway will recognise, routing traffic accordingly. This routing method, linked to the VPC Endpoint above, means that you can tell traffic which is bound for the VPC from an external source to route itself to the VPC endpoint of the Gateway Load Balancer. Once the payload hits the AWS Gateway Load Balancer (GWLB), the GENEVE protocol elements are added, and the encapsulated payload is sent to the network appliances. A compatible network appliance completes the work it needs to do on the payload, sends it back through to the GWLB\u0026rsquo;s VPC Endpoint, where the payload continues on to its original destination, as the stripped headers will contain the networking information required as it was prior to the encapsulation.\nTraffic flow for the Gateway Load Balancer The flow goes as follows:\nTraffic destined for the External IP of the Application Load Balancer hits the Internet Gateway; using the Edge Associated Route Table, in the translation between the External IP and the Internal IP, traffic is redirected to the Gateway Load Balancer\u0026rsquo;s VPC Endpoint The payload is then transported over to the VPC Endpoint Service linked to the Gateway Load Balancer From the VPC Endpoint Service, the payload reaches the Gateway Load Balancer, where the GENEVE headers are added, so that the payload knows to send the data to the Network Appliance running on EC2 that is part of the GWLB target group. Once filtered or checked by the network appliance, the payload is then sent back to the Gateway Load Balancer where the packets are un-encapsulated. 
The payload is then pushed back to the VPC Endpoint Service The payload reaches its original deviation point, at the VPC Endpoint in the VPC where the traffic was originally destined. As the packet is no longer encapsulated, this means that the payload is then able to successfully travel to it\u0026rsquo;s intended destination, the Application Load Balancer. At this point the payload is as normal, for any load balancing, and it will reach the destination specified in the target group for the ALB, an EC2 instance in this case. What\u0026rsquo;s next? With the flow in mind, and the way that the payload is encapsulated through the AWS Gateway Load Balancer, that opens up a number of options that were usually closed off for organisations. To put this into practice, there are still some additional steps that you need to take.\nIn the next part of my blog series, I will show how to take the new service and link it together in a network design that could be used for larger organisations that require a higher level of control over their network, while still being able to use legacy on premise networks!\nI am not one for hiding behind the \u0026ldquo;you will see next time\u0026rdquo;, but what I do have are links to the code samples I will be using! Means you can also get a head start!\nI will also be concentrating on a Firewall Appliance created by Check Point. 
The reason for this, is that I have been working with a customer using this service and they have an existing AWS Marketplace profile that contains a number of appliances, including a CloudGuard Network Security for the Gateway Load Balancer appliance, which has been specifically modified to work with the GENEVE protocol.\nCode examples are available here for this type of service:\n2-egress-vpc 3-dmz-vpc ","date":"2022-11-16T11:25:00Z","image":"https://static.colinbarker.me.uk/img/blog/2022/02/gwlb_sideways_flow.png","permalink":"https://colinbarker.me.uk/blog/2022-11-16-gwlb/","title":"What is an AWS Gateway Load Balancer anyway? (Part 1)"},{"content":"What is it? During the COVID-19 Pandemic, Tokonatsu Festival needed to run a virtual online only event. One of the many ideas included the creation of a Virtual Portal with an online created Matsuri. This included a virtual version of our Japanese Virtual Wish Tree.\nPhoto by Lennart Jönsson (@lenjons) on Unsplash Tanabata which is also known as the Star Festival, celebrates the meeting of the deities Orihime and Hikoboshi. A popular custom of this festival involves writing wishes on strips of paper. In modern times, these small bits of paper are hung on bamboo, or with Tokonatsu, our Wish Tree shrine.\nThis is where our Serverless Framework solution came from.\nBackground The original code for the Tokonatsu Wish Tree was written by the Co-Vice Chair, Adam Hay. This used:\nNodeJS AWS API Gateway AWS CloudFront AWS Lambda AWS DynamoDB The code for which is available on the Tokonatsu GitHub under the WishTreeApplication.\nServerless Framework https://www.serverless.com/ An open-source framework that helps to easily build serverless applications on services like AWS. 
You use a combination of YAML and a CLI to control, manage, maintain, and operate your application.\nServerless Configuration File This is controlled from the serverless.yaml file within the api folder; here is a quick breakdown of the file:\n1 2 3 4 5 6 7 8 9 10 11 12 service: toko-wishtree-api # Name of the serverless service frameworkVersion: \u0026#34;2\u0026#34; # Framework version to use provider: # Provider Block name: aws # Which cloud provider (AWS) runtime: nodejs12.x # Which Runtime (NodeJS12) lambdaHashingVersion: 20201221 # Used to remove deprecated options stage: ${opt:stage, \u0026#39;dev\u0026#39;} # API Gateway Stage (defaults to dev) region: ${opt:region, \u0026#39;eu-west-2\u0026#39;} # AWS Region (defaults to London) apiGateway: # Specific API Gateway config shouldStartNameWithService: true # Ensure the API GW service starts with this # --snipped-- Here we set up the top-level information for the serverless package, setting up values including the service name, which is useful for identifying your deployment in AWS, and the framework version.\nThe provider section is where we start to define how the Serverless Framework will interact with the destination, and which specific codebase you are running. In this instance, we are using AWS with NodeJS version 12.x.\nIAM Permissions To ensure access to the DynamoDB table, we can define permissions in a new role which can be assigned to the Lambda functions. 
This will mean no specific IAM keys or credentials will be needed within the code.\n1 2 3 4 5 6 7 8 9 10 11 iam: role: statements: - Effect: \u0026#34;Allow\u0026#34; Action: - \u0026#34;dynamodb:GetItem\u0026#34; - \u0026#34;dynamodb:Scan\u0026#34; - \u0026#34;dynamodb:PutItem\u0026#34; - \u0026#34;dynamodb:UpdateItem\u0026#34; Resource: - arn:aws:dynamodb:*:*:table/${file(./config.${opt:stage, self:provider.stage, \u0026#39;dev\u0026#39;}.json):WISH_TREE_TAGS_TABLE} The Functions Each API endpoint we need to define will need to have a function to go along with it. In this case, we can see the getTagsForWishTree function, which is used by the Javascript on the static site to collect a list of the current tags that it has stored in the DynamoDB table.\n1 2 3 4 5 6 7 8 9 10 11 12 13 functions: getTagsForWishTree: name: wish-tree-${opt:stage, self:provider.stage, \u0026#39;dev\u0026#39;}-get-tags handler: getTagsForWishTree.handler description: Function to get the Wishtree Tags from the DB environment: ENV: ${opt:stage, self:provider.stage, \u0026#39;dev\u0026#39;} WISH_TREE_TAGS_TABLE: ${file(./config.${opt:stage, self:provider.stage, \u0026#39;dev\u0026#39;}.json):WISH_TREE_TAGS_TABLE} events: - http: path: getTagsForWishTree method: get cors: true This will generate two elements. The Lambda function uses the NodeJS code with the filename getTagsForWishTree.js and will look for the handler function as its entry point. The Lambda function then gets a set of environment variables called ENV and WISH_TREE_TAGS_TABLE, of which the latter contains the configured DynamoDB table name.\nThe events section defines what API Gateway should set up; in this instance the config asks for a path value of getTagsForWishTree using the HTTP method of GET, while also ensuring that the Cross-Origin Resource Sharing (CORS) headers are set. 
This ensures that browsers know which resources they should permit to be loaded, and which ones they shouldn\u0026rsquo;t.\nThe Resources This is where we can get a little creative, as the Serverless Framework allows us to include other non-Serverless Framework resources as part of our deployment, beyond the usual Serverless technologies of Lambda and API Gateway.\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 resources: Resources: tagsTable: Type: AWS::DynamoDB::Table Properties: TableName: ${file(./config.${opt:stage, self:provider.stage, \u0026#39;dev\u0026#39;}.json):WISH_TREE_TAGS_TABLE} AttributeDefinitions: - AttributeName: Description AttributeType: S KeySchema: - AttributeName: Description KeyType: HASH ProvisionedThroughput: ReadCapacityUnits: 5 WriteCapacityUnits: 5 SSESpecification: SSEEnabled: true Here we can see the creation of our DynamoDB table with Server Side Encryption enabled to ensure that the wish tree tags are secured for our use only. A small call out to the AWS Well-Architected Framework\u0026rsquo;s Security Pillar!\nWhat about AWS CloudFront? Typically an AWS API Gateway uses its own naming system and stage information to generate the API Endpoint, which takes the form of:\n1 https://{restapi-id}.execute-api.{region}.amazonaws.com/{stageName} This isn\u0026rsquo;t very friendly! 
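As a concrete illustration (with a made-up restapi-id, since AWS generates the real one at deploy time), the default invoke URL for the getTagsForWishTree endpoint would look something like this:

```python
# Hypothetical values - the real restapi-id is generated by AWS at deploy time
restapi_id = "a1b2c3d4e5"
region = "eu-west-2"  # the provider default from serverless.yaml
stage_name = "dev"    # the default stage

endpoint = (
    f"https://{restapi_id}.execute-api.{region}.amazonaws.com"
    f"/{stage_name}/getTagsForWishTree"
)
print(endpoint)
# https://a1b2c3d4e5.execute-api.eu-west-2.amazonaws.com/dev/getTagsForWishTree
```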
With the use of Custom Domains we are able to choose our own user-friendly domain and map it to our API endpoint, finishing off the whole package.\nThe Custom Domain settings on API Gateway Final Words The full setup instructions on how to build your own Wishtree application are available on the WishTreeApplication repo on GitHub.\nLead photo by Lennart Jönsson on Unsplash ","date":"2021-03-14T20:12:50Z","image":"https://static.colinbarker.me.uk/img/blog/2021/03/wishtree.jpg","permalink":"https://colinbarker.me.uk/blog/2021-03-14-serverless-tokonatsu-wishtree-app/","title":"Wishtrees and Serverless on AWS"},{"content":"I have been extremely lucky that DevOpsGroup allow me the flexibility in my work to attend previous re:Invent conferences; however, this year was very different. COVID-19 and the worldwide pandemic changed the way we work, interact, and how conferences should run. re:Invent was no different.\nThis year, AWS put on a 2 week online conference, and were still able to include all the bells and whistles that make re:Invent special!\nFor now, here are my thoughts on some of the announcements made during the event.\nSorry for the bad-quality screenshots; unlike a real event where I would remember my camera, I spent most of it trying to remember how to take screenshots! 
(And somehow made them low quality!)\nAurora Serverless v2 Screenshot of the Aurora Serverless v2 bullet points Quite a significant change to the original Aurora Serverless (now v1) service, introducing additional features which open the usage of this database engine to even more creative ideas and solutions that might not have been available to Solution Architects before.\nRead Replicas were a sorely missed feature from the v1 service, which did make the choice harder to sell to customers looking for a specific type of disaster recovery solution, or being able to split their database traffic load in a very specific way.\nThis is a preview at the moment, but I will be keeping a close eye on this in the near future.\nAWS Blog Post Sign up for Early Access Aurora Serverless v2 Documentation Babelfish for Aurora PostgreSQL Screenshot of the AWS Babelfish for Aurora PostgreSQL announcement Announced as an open source (Apache-2.0) project, Babelfish for Aurora PostgreSQL is a welcome addition to any DBA, Engineer, and Solutions Architect\u0026rsquo;s bag of tricks! As with many projects, the challenge that always trips up those planning on a migration into the cloud is that the built-up tech debt of old database technologies and setups does not quite fit nicely into your plan. This opens a door which really needed to be opened.\nThis specific version makes it easier to migrate to Amazon Aurora from Microsoft\u0026rsquo;s SQL Server by \u0026ldquo;translating\u0026rdquo; the commands from applications destined for SQL Server over to Amazon Aurora with very few code changes. A nod to The Hitchhiker\u0026rsquo;s Guide to the Galaxy by Douglas Adams, where so many other translation tools also get their name!\nWhile it does understand a large number of the standard use cases for SQL Server, care would need to be taken to ensure that you are able to translate everything. 
Or at least, to find a way to migrate some of the more Microsoft-proprietary elements of SQL Server.\nOnce again, this is in preview, and I will be keeping an eye out for more. Hopefully a MySQL-compatible one for assisting with the move into Aurora?\nBabelfish for Aurora PostgreSQL Page Sign up for Early Access Open source home page AWS CloudShell Screenshot of AWS CloudShell running Another welcome addition to the AWS Console, AWS CloudShell is just the missing item that allows operators of AWS to quickly get to a CLI without worrying about what they need to have installed. Other cloud providers had already included a function like this, and the addition of the CloudShell to AWS was great to see.\nHaving a bit of a head start over the people who jumped in when it was announced, I was able to quickly get a console up and running and have a look around.\nLike with other CloudShells, the 1GB of persistent storage is very handy for keeping any non-secure settings readily available, and does save a few headaches when logging back in for the first time. It comes pre-installed with the AWS CLI, and running on an Amazon Linux 2 distro means that a number of other popular tools are available.\nPricing: free. Can\u0026rsquo;t really argue with that!\nIf you can get on, I am sure you will understand that having this tool in your back pocket for any event will make your life a lot easier!\nAWS CloudShell Page CloudShell Features CloudShell User Guide Conclusion As always, there was a lot to take in, and I only barely scratched the surface. 
I\u0026rsquo;d recommend you look at some of AWS\u0026rsquo;s Blogs to see all of the other announcements, and see if there is anything that might change the way you think about solutions on AWS Cloud.\nAWS re:Invent 2020 - Top Announcements Andy Jassy\u0026rsquo;s Keynote Werner Vogels\u0026rsquo; Keynote ","date":"2020-12-14T10:12:50Z","image":"https://static.colinbarker.me.uk/img/blog/2020/12/reinvent-logo.png","permalink":"https://colinbarker.me.uk/blog/2020-12-14-aws-reinvent-2020-thoughts/","title":"AWS re:Invent 2020 - My thoughts"},{"content":"Seems pretty simple: a CloudFlare front end, with an AWS S3 bucket to hold the content for a static website generated using Hugo. But is it?\nFor me to get to where I am, with two websites running in this way, it was a long journey and a lot of experience gained along the way. Here I will go into the details on why we are here today, and over time explain the full journey over a number of posts.\nWhat are the key details? CloudFlare Not that I have a problem with AWS CloudFront; the choice of CloudFlare was a legacy one. Both the Tokonatsu Festival and my personal website have had many different backing technologies over the years. From Drupal to Wordpress, and even at one point a custom-built PHP CMS system that I wrote from the ground up back in the days of DDR:UK. All of these have one thing in common: terrible speeds under load!\nCloudFlare offered a service at the time which my hosting provider did not have, and mainly due to time, I have not had a chance to change this in any way. It works perfectly well, and it would be a shame to break that!\nAWS S3 Websites A staple for any static website. Having a large amount of processing power behind a simple website doesn\u0026rsquo;t make sense any more. 
Looking back at the days before \u0026ldquo;The Cloud\u0026rdquo;, I used a desktop computer, shared hosting on a Plesk service, IBM rack-mounts in physical data centres in Central London, and even an ex-nuclear bunker. These all still have a place in the modern world; however, as time has progressed, the use cases for these hosted systems have become narrower. Cost becomes a primary issue, and for someone running a very simple blog, or a front face for a festival, an AWS S3 bucket can do the job just as well at a fraction of the cost.\nHugo Why did I choose Hugo, a static website generator written in Go? Probably as simple as the CloudFlare decision! Someone mentioned it to me and I stuck with it! Primarily it was a decision made during the transition from Drupal 6 for the Tokonatsu Festival to its own static version of the site.\nAt the time, the Tokonatsu Festival used Drupal, as most of the Anime Conventions in the UK used a similar ticketing system which had been written for Drupal 5, and I was one of a number of events that attempted to upgrade it to work with Drupal 6. The festival moved away from the normal convention style of running, getting closer in style to other UK festivals. Thus we moved to a ticketing system called Pretix, and all of the processing for the website was pushed away, leaving a very heavy CMS system serving a bunch of mostly static pages running on AWS Elastic Beanstalk.\nThe transition to Hugo took some time to get right, and modifying the theme from the Drupal site was the hardest part, but we got there in the end. Which brought me to my own site! Why re-invent the wheel when you have everything working!\nWhat\u0026rsquo;s next? Automate everything! 
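At its core, the automation being talked about here boils down to two commands: build the site with Hugo, then sync the output to the S3 bucket. This is a minimal sketch of that loop, assuming the standard Hugo + AWS CLI workflow; the bucket name is a placeholder, and the run() wrapper only prints each command so the steps are easy to follow (drop it to execute them for real):

```shell
#!/usr/bin/env sh
# Sketch of a static-site deploy pipeline: Hugo build, then S3 sync.
set -eu

BUCKET="example-website-bucket"   # placeholder, not a real bucket name

# Dry-run wrapper: prints the command instead of executing it.
run() { echo "+ $*"; }

run hugo --minify                                 # build the site into ./public
run aws s3 sync public/ "s3://$BUCKET" --delete   # upload, removing stale files
```

A CI job (GitLab, in this case) would simply run these two commands on every push to the site's repository.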
My next goal will be to complete the automation of this blog; as it stands, I am borrowing the automation used for the Tokonatsu Festival site, which uses a self-hosted version of GitLab.\nDescribe in-depth the different parts To prevent this from becoming a massive blog post, I plan to describe in more detail the different parts of this, so expect details on:\nAWS S3 Bucket Security, what I am doing, and what you should be doing! CloudFlare, how to set that up to work with AWS S3 static website hosting Automation, how I automated the Tokonatsu website\u0026rsquo;s deployment into AWS S3 I was being serious When I said I hosted on a desktop PC, it really wasn\u0026rsquo;t a joke! My first ever view of \u0026ldquo;hosting\u0026rdquo; and how to serve things on the internet was installing Windows 95 (yes, you have permission to pull me apart on this one!) on a desktop computer and giving it a publicly routed IP address, and it survived!\nI do have to call out ExNet and Damon for helping me at the start. Looking after my desktop box inside your ISP! That, however, will be a story for another day!\nThe original Faereal Server ","date":"2020-07-02T10:30:00Z","image":"https://static.colinbarker.me.uk/img/blog/2020/07/diagram.png","permalink":"https://colinbarker.me.uk/blog/2020-07-02-hugo-aws-hosting/","title":"What's behind this blog?"},{"content":"So I have finally decided to have another go at creating a website. I decided to start using Hugo again as my static site generator, mainly due to cost, but also to speed up the delivery of the site.\nI have used Hugo as part of the static website for Tokonatsu, a Japanese Culture and Camping Festival based in Henlow, Bedfordshire, UK. It has proven to be pretty reliable, so why not use it for my own website!\n","date":"2020-06-16T10:00:00.01Z","permalink":"https://colinbarker.me.uk/blog/2020-06-16-new-blog/","title":"New Blog!"}]