{"id":1086,"date":"2025-10-21T07:12:04","date_gmt":"2025-10-21T07:12:04","guid":{"rendered":"https:\/\/cloudspert.com\/?p=1086"},"modified":"2025-10-21T07:12:04","modified_gmt":"2025-10-21T07:12:04","slug":"open-infra-europe-summit-2025-what-you-missed","status":"publish","type":"post","link":"https:\/\/cloudspert.com\/?p=1086","title":{"rendered":"Open Infra Europe Summit 2025 \u2014 What You Missed"},"content":{"rendered":"<figure><img decoding=\"async\" alt=\"\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1024\/1*Aan5mbFkeWyt1ahszT9vIw@2x.jpeg\" \/><\/figure>\n<p>As you may or may not know, Paris hosted the Open Infra summit this past weekend at the legendary \u00c9cole Polytechnique. The event was co-located with the Gerrit User Summit and VM Migration Day. The goal of this blog is to rewind time, relive the three days, and walk you through some of the interesting presentations we attended.<\/p>\n<p><strong>VMware to OpenStack with Ansible OS-Migrate<\/strong><\/p>\n<p>Amid growing concerns over VMware licensing, many organizations are considering OpenStack as an alternative. A key aspect of this transition is migrating virtual machines (VMs) from VMware to OpenStack. During the summit, several solutions were showcased. In this blog, we&rsquo;ll focus on <a href=\"https:\/\/github.com\/os-migrate\/vmware-migration-kit\">os-migrate<\/a>, an Ansible collection created to facilitate such a migration.<\/p>\n<p>This collection supports multiple types of migration:<\/p>\n<ul>\n<li>The default migration method uses an nbdkit server with a conversion host (an OpenStack instance hosted in the destination cloud). This approach enables the use of CBT (Changed Block Tracking) and allows for near-zero downtime during migration.<\/li>\n<li>The second method leverages virt-v2v bindings with a conversion host. 
You can either use an existing OpenStack instance as the conversion host or let OS-Migrate automatically deploy one for\u00a0you.<\/li>\n<li>A third option allows you to skip the conversion host entirely and perform the migration directly on a Linux machine. In this case, the converted volume can be uploaded as a Glance image or later used as a Cinder volume. However, this approach is not recommended for large disks or a high number of VMs, as it is significantly slower than the other\u00a0methods.<\/li>\n<\/ul>\n<p>Let&rsquo;s focus on the first approach, which consists of creating a conversion VM on the destination, attaching a Cinder volume to this VM, and initiating a full copy of the VM\u2019s disk to the attached volume using an nbdkit server. At this point, the CBT ID from the source VMware disk is recorded and written as metadata on the target Cinder\u00a0volume.<\/p>\n<figure><img decoding=\"async\" alt=\"\" src=\"https:\/\/cdn-images-1.medium.com\/max\/665\/1*hoY2XtuXYrdXZ34Yks9LpA.png\" \/><figcaption>Image from the os-migrate GitHub repository<\/figcaption><\/figure>\n<p>After the initial copy, the tool compares the CBT ID in the Cinder metadata with the current ID on the source disk; if they differ, only the changed blocks (the delta) are transferred.<\/p>\n<p>Once the delta transfer is complete, the conversion host converts the disk to a format that KVM accepts. The VM is then instantiated in the destination OpenStack environment and started, resulting in minimal downtime during the migration.<\/p>\n<figure><img decoding=\"async\" alt=\"\" src=\"https:\/\/cdn-images-1.medium.com\/max\/665\/1*sRXTK6Km7xz7W8FtsV-4ag.png\" \/><figcaption>Image from the os-migrate GitHub repository<\/figcaption><\/figure>\n<p><strong>OVN traffic flow &amp; Troubleshooting<\/strong><\/p>\n<p>OVN is steadily gaining traction in both the OpenStack and Kubernetes communities. This trend was evident at the summit, where multiple talks on OVN were held. 
Each session was packed, with no seats left, which shows the growing interest in OVN. As adoption grows, the ability to troubleshoot effectively is no longer optional. An important step in troubleshooting is knowing the commands and mapping OpenStack resources (ports, routers\u2026) to their OVN\u00a0objects.<\/p>\n<p>I invite you to check this cheat sheet on <a href=\"https:\/\/gist.github.com\/velp\/55d8a4345e39d9dc04175bc3ec8e2cad\">GitHub<\/a>, which lists the steps and commands to map these resources.<\/p>\n<p><strong>Who framed RabbitMQ?<\/strong><\/p>\n<p>Who hasn\u2019t suffered with RabbitMQ while managing an OpenStack cluster? Seeing errors such as \u201cmessage with id xxx timeout\u201d, or losing a queue when a node goes down, is routine in the career of an OpenStack administrator.<br \/>Here is some of the advice shared during the presentation:<\/p>\n<ul>\n<li>Upgrade to version 4.1 of\u00a0RabbitMQ<\/li>\n<\/ul>\n<p>This version ships with better throughput, more parallelism, and lower CPU utilization for quorum queues.<br \/>Classic queue mirroring (HA classic mirrored queues) was removed in 4.0, so you need to migrate to quorum queues before upgrading. Quorum queues have been available since RabbitMQ 3.8.<br \/>To enable them, add the following to the [oslo_messaging_rabbit] section of each service\u2019s configuration\u00a0file:<\/p>\n<pre>[oslo_messaging_rabbit]<br \/>rabbit_quorum_queue = True<br \/>rabbit_transient_quorum_queue = True<\/pre>\n<ul>\n<li>Avoid missed Heartbeats<\/li>\n<\/ul>\n<p>I guess we have all seen the repeated messages about closed connections in the RabbitMQ logs. They were caused by multiple issues in py-amqp, which did not respect the timeout; these have since been fixed, so make sure you\u2019re using the latest version. Another fix is to switch every service running under Apache from the worker Multi-Processing Module (MPM) to\u00a0event.<\/p>\n<ul>\n<li>Avoid Queue\u00a0churn<\/li>\n<\/ul>\n<p>In case you are not familiar with queue churn, here is a short definition. 
Queue churn refers to the rapid creation and deletion of queues in RabbitMQ. It happens with transient queues such as reply and fanout\u00a0queues.<\/p>\n<p>Reply queues: temporary queues created per RPC call. As the name implies, when a service like nova-api makes a request to nova-compute that needs a reply, it creates a queue for that response only; once the RPC call is finished, the queue is\u00a0deleted.<\/p>\n<p>Fanout queues: these work like broadcast queues, where each message is delivered to all subscribers without any\u00a0filter.<\/p>\n<p>To fix this issue, use the configuration below:<\/p>\n<pre>[oslo_messaging_rabbit]<br \/>use_queue_manager = True<br \/># Set hostname to the name of your host<br \/>hostname = controller-01<br \/>processname = neutron<\/pre>\n<p>Enabling <strong>use_queue_manager<\/strong> makes oslo.messaging use consistent queue names based on the hostname and process name instead of random UUIDs. This lets services reuse the same queues after restarts, reducing RabbitMQ overhead and startup time. It also simplifies debugging RabbitMQ, as queues are identified by host and service\u00a0name.<\/p>\n<ul>\n<li>Streams<\/li>\n<\/ul>\n<p>Streams were introduced in RabbitMQ 3.9 (quorum queues arrived in 3.8). They work similarly to Kafka topics and replace the many transient, randomly named queues (one per consumer or service instance) that fanout would otherwise create.<\/p>\n<figure><img decoding=\"async\" alt=\"\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1024\/0*94vwzYf5jp2DJbsE.png\" \/><figcaption><a href=\"https:\/\/www.cloudamqp.com\/blog\/rabbitmq-streams-and-replay-features-part-1-when-to-use-rabbitmq-streams.html\">https:\/\/www.cloudamqp.com\/blog\/rabbitmq-streams-and-replay-features-part-1-when-to-use-rabbitmq-streams.html<\/a><\/figcaption><\/figure>\n<p>With streams, all messages are written to a single append-only log that is persisted to disk. 
This allows a consumer that was down when the messages were originally published to replay them once it is back.<\/p>\n<figure><img decoding=\"async\" alt=\"\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1024\/0*0LCMb5HDoyIDB7Ag.png\" \/><figcaption><a href=\"https:\/\/www.cloudamqp.com\/blog\/rabbitmq-streams-and-replay-features-part-1-when-to-use-rabbitmq-streams.html\">https:\/\/www.cloudamqp.com\/blog\/rabbitmq-streams-and-replay-features-part-1-when-to-use-rabbitmq-streams.html<\/a><\/figcaption><\/figure>\n<pre>[oslo_messaging_rabbit]<br \/>rabbit_stream_fanout = True<\/pre>\n<p>Since messages are written to disk, make sure to configure a RabbitMQ policy that deletes old messages to prevent the disk from filling\u00a0up.<\/p>\n<pre>rabbitmqctl set_policy stream-policy \".*_fanout.*\" \\<br \/>  '{\"max-length-bytes\":15000000, \"stream-max-segment-size-bytes\":5000000}' \\<br \/>  --apply-to streams<\/pre>\n<p><strong>Beyond Overcommit: Monitoring-Aware OpenStack Nova Scheduling<\/strong><\/p>\n<p>To place a VM on a hypervisor, nova-scheduler uses a set of filters and weights. Filters eliminate unsuitable hypervisors. 
For example, the AggregateInstanceExtraSpecsFilter ensures that if a VM is created with a flavor whose extra specs match the metadata of an <a href=\"https:\/\/docs.openstack.org\/nova\/latest\/admin\/aggregates.html\">aggregate<\/a>, only the hypervisors in that aggregate are returned.<\/p>\n<figure><img decoding=\"async\" alt=\"\" src=\"https:\/\/cdn-images-1.medium.com\/max\/550\/0*G_oqYZfFZjbRL1YC.png\" \/><figcaption><a href=\"https:\/\/docs.openstack.org\/nova\/latest\/admin\/scheduling.html\">https:\/\/docs.openstack.org\/nova\/latest\/admin\/scheduling.html<\/a><\/figcaption><\/figure>\n<p>Weights, on the other hand, rank the remaining hosts to pick the most suitable one, based on criteria such as RAM, CPU, disk space, the number of VMs already on the hypervisor, and other\u00a0factors.<\/p>\n<figure><img decoding=\"async\" alt=\"\" src=\"https:\/\/cdn-images-1.medium.com\/max\/500\/0*ovxwShx-LsWVoii1.png\" \/><figcaption><a href=\"https:\/\/docs.openstack.org\/nova\/latest\/admin\/scheduling.html\">https:\/\/docs.openstack.org\/nova\/latest\/admin\/scheduling.html<\/a><\/figcaption><\/figure>\n<p>So far, everything seems perfect, and this is already the default behavior in OpenStack. So what\u2019s new? Before we get to that, are you familiar with overcommit?<\/p>\n<p>Overcommit is the practice of allocating more resources than physically exist on the hypervisor. In most cases, requesting 4 CPUs and 8 GB of RAM for a VM doesn\u2019t mean these resources will be fully used, or used simultaneously with other VMs on the same hypervisor. To avoid wasting resources, we can place more VMs on a hypervisor than its physical capacity would allow.<\/p>\n<p>In Nova, this practice can be controlled using:<\/p>\n<ul>\n<li>cpu_allocation_ratio: the ratio of virtual CPUs to physical CPUs that can be allocated.<\/li>\n<li>ram_allocation_ratio: the ratio of virtual RAM to physical RAM that can be allocated.<\/li>\n<\/ul>\n<p>If you don\u2019t see the issue with this yet, let me explain. 
When running a large public cloud, we can\u2019t know whether a VM will be fully utilized or only partially used. Because we are overcommitting resources, this uncertainty can lead to hypervisors with busy VMs becoming overloaded. Additionally, the allocation ratios we configure (cpu_allocation_ratio and ram_allocation_ratio) are static and do not adjust dynamically based on the current state of the hypervisor.<\/p>\n<p>To address this, we need a specific filter or plugin that can, for example, check the average CPU usage over the last 24 hours. Based on this information, we can decide whether to include or exclude the hypervisor. This allows us to schedule VMs based on the actual state of the hypervisors.<\/p>\n<figure><img decoding=\"async\" alt=\"\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1024\/1*lKO9-EIgDhg0eKpib5Ggew.png\" \/><\/figure>\n<p>You can start building your own filter based on this <a href=\"https:\/\/github.com\/ovh\/nova-ext-sched\">GitHub<\/a> repository.<\/p>\n<p>I\u2019ve only selected a few talks from the three days of the summit. I hope you found these presentations as interesting as I\u00a0did.<\/p>\n<p>See you in another\u00a0blog!<\/p>\n<p>Byeee!<\/p>","protected":false},"excerpt":{"rendered":"<p>As you may or may not know, Paris hosted the Open Infra summit this past weekend at the legendary \u00c9cole Polytechnique. The event was co-located with the Gerrit User Summit and VM Migration Day. The goal of this blog is to rewind time, relive the three days, and walk you through some of the interesting presentations we attended. VMware to OpenStack with Ansible OS-Migrate Amid growing concerns over VMware licensing, many organizations are considering OpenStack as an alternative. 
A key aspect of this transition is migrating virtual machines (VMs) from VMware to OpenStack. During the summit, several solutions were showcased. In this blog, we&rsquo;ll focus on os-migrate, an Ansible collection created to facilitate such a migration. This collection supports multiple types of migration: The default migration method uses an nbdkit server with a conversion host (an OpenStack instance hosted in the destination cloud). This approach enables the use of CBT (Changed Block Tracking) and allows for near-zero downtime during migration. The second method leverages virt-v2v bindings with a conversion host. You can either use an existing OpenStack instance as the conversion host or let OS-Migrate automatically deploy one for\u00a0you. A third option allows you to skip the conversion host entirely and perform the migration directly on a Linux machine. In this case, the converted volume can be uploaded as a Glance image or later used as a Cinder volume. However, this approach is not recommended for large disks or a high number of VMs, as it is significantly slower than the other\u00a0methods. Let&rsquo;s focus on the first approach, which consists of creating a conversion VM on the destination, attaching a Cinder volume to this VM, and initiating a full copy of the VM\u2019s disk to the attached volume using an nbdkit server. At this point, the CBT ID from the source VMware disk is recorded and written as metadata on the target Cinder\u00a0volume. Image from the os-migrate GitHub repository After the initial copy, the tool compares the CBT ID in the Cinder metadata with the current ID on the source disk; if they differ, only the changed blocks (the delta) are transferred. Once the delta transfer is complete, the conversion host converts the disk to a format that KVM accepts. The VM is then instantiated in the destination OpenStack environment and started, resulting in minimal downtime during the migration. 
Image from the os-migrate GitHub repository OVN traffic flow &amp; Troubleshooting OVN is steadily gaining traction in both the OpenStack and Kubernetes communities. This trend was evident at the summit, where multiple talks on OVN were held. Each session was packed, with no seats left, which shows the growing interest in OVN. As adoption grows, the ability to troubleshoot effectively is no longer optional. An important step in troubleshooting is knowing the commands and mapping OpenStack resources (ports, routers\u2026) to their OVN\u00a0objects. I invite you to check this cheat sheet on GitHub, which lists the steps and commands to map these resources. Who framed RabbitMQ? Who hasn\u2019t suffered with RabbitMQ while managing an OpenStack cluster? Seeing errors such as \u201cmessage with id xxx timeout\u201d, or losing a queue when a node goes down, is routine in the career of an OpenStack administrator. Here is some of the advice shared during the presentation: Upgrade to version 4.1 of\u00a0RabbitMQ This version ships with better throughput, more parallelism, and lower CPU utilization for quorum queues. Classic queue mirroring (HA classic mirrored queues) was removed in 4.0, so you need to migrate to quorum queues before upgrading. Quorum queues have been available since RabbitMQ 3.8. To enable them, add the following to the [oslo_messaging_rabbit] section of each service\u2019s configuration\u00a0file: [oslo_messaging_rabbit] rabbit_quorum_queue = True rabbit_transient_quorum_queue = True Avoid missed Heartbeats I guess we have all seen the repeated messages about closed connections in the RabbitMQ logs. They were caused by multiple issues in py-amqp, which did not respect the timeout; these have since been fixed, so make sure you\u2019re using the latest version. Another fix is to switch every service running under Apache from the worker Multi-Processing Module (MPM) to\u00a0event. Avoid Queue\u00a0churn In case you are not familiar with queue churn, here is a short definition. 
Queue churn refers to the rapid creation and deletion of queues in RabbitMQ. It happens with transient queues such as reply and fanout\u00a0queues. Reply queues: temporary queues created per RPC call. As the name implies, when a service like nova-api makes a request to nova-compute that needs a reply, it creates a queue for that response only; once the RPC call is finished, the queue is\u00a0deleted. Fanout queues: these work like broadcast queues, where each message is delivered to all subscribers without any\u00a0filter. To fix this issue, use the configuration below: [oslo_messaging_rabbit] use_queue_manager = True hostname = controller-01 (set this to the name of your host) processname = neutron Enabling use_queue_manager makes oslo.messaging use consistent queue names based on the hostname and process name instead of random UUIDs. This lets services reuse the same queues after restarts, reducing RabbitMQ overhead and startup time. It also simplifies debugging RabbitMQ, as queues are identified by host and service\u00a0name. Streams Streams were introduced in RabbitMQ 3.9 (quorum queues arrived in 3.8). They work similarly to Kafka topics and replace the many transient, randomly named queues (one per consumer or service instance) that fanout would otherwise create. https:\/\/www.cloudamqp.com\/blog\/rabbitmq-streams-and-replay-features-part-1-when-to-use-rabbitmq-streams.html With streams, all messages are written to a single append-only log that is persisted to disk. This allows a consumer that was down when the messages were originally published to replay them once it is back. https:\/\/www.cloudamqp.com\/blog\/rabbitmq-streams-and-replay-features-part-1-when-to-use-rabbitmq-streams.html [oslo_messaging_rabbit] rabbit_stream_fanout = True Since messages are written to disk, make sure to configure a RabbitMQ policy that deletes old messages to prevent the disk from filling\u00a0up. 
rabbitmqctl set_policy stream-policy \".*_fanout.*\" '{\"max-length-bytes\":15000000, \"stream-max-segment-size-bytes\":5000000}' --apply-to streams Beyond Overcommit: Monitoring-Aware OpenStack Nova Scheduling To place a VM on a hypervisor, nova-scheduler uses a set of filters and weights. Filters eliminate unsuitable hypervisors. For example, the AggregateInstanceExtraSpecsFilter ensures that if a VM is created with a flavor whose extra specs match the metadata of an aggregate, only the hypervisors in that aggregate are returned. https:\/\/docs.openstack.org\/nova\/latest\/admin\/scheduling.html Weights, on the other hand, rank the remaining hosts to pick the most suitable one, based on criteria such as RAM,<\/p>\n","protected":false},"author":3,"featured_media":1087,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-1086","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized","entry","has-media"],"jetpack_featured_media_url":"https:\/\/cloudspert.com\/wp-content\/uploads\/2025\/10\/1Aan5mbFkeWyt1ahszT9vIw@2x-xQrlIS.jpg","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/cloudspert.com\/index.php?rest_route=\/wp\/v2\/posts\/1086","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cloudspert.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cloudspert.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cloudspert.com\/index.php?rest_route=\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/cloudspert.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1086"}],"version-history":[{"count":0,"href":"https:\/\/cloudspert.com\/index.php?rest_route=\/wp\/v2\/posts\/1086\/
revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/cloudspert.com\/index.php?rest_route=\/wp\/v2\/media\/1087"}],"wp:attachment":[{"href":"https:\/\/cloudspert.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1086"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cloudspert.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1086"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cloudspert.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1086"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}