Asking the Right Questions – Building AI Tools one question at a time

In the movie I, Robot, Will Smith’s character, Detective Spooner, talks to Dr. Lanning’s pre-recorded holographic message. It says, “I’m sorry! My responses are limited. You must ask the right questions,” and later, when Spooner asks about revolution, Lanning says, “That, Detective, is the right question.”

Using AI tools with the right question can be revolutionary for you and those you serve.

With today’s tools, such as OpenAI’s ChatGPT, Anthropic’s Claude, Google’s Gemini, and others, you can tell the tool what you want and it will give you an answer. It may or may not be what you want. When you’re not sure of the details (which is most of us), I have found it most useful to ask the tool the right question. This is like rubber duck debugging, where you explain your problem to a rubber duck, except in this case the duck answers back.

The Problem: Troubleshooting System Logs

For example, as an IT professional, I’m often tasked with troubleshooting an issue on computer systems. An application breaks, or there is a memory leak, a newly discovered bug, network connection issues… who knows?

One of the things I found, as many of us know, is that logs often have clues about what went wrong. The questions I asked myself were “What if I could ask the logs what is wrong? Can someone other than me interact with a log? How would I do it?” Those are the right questions. My search for an agentic tool began with one of them: how could I use AI to develop a tool that reads a log and troubleshoots the system with it?

I asked ChatGPT how I could write an agent that reads a log or logs, develops a series of hypotheses about what may be wrong or what the log indicates, and then comes up with possible solutions based on those hypotheses.

A hypothesis requires testing, and ideally a well-formed hypothesis draws on background knowledge and an understanding of the problem. The AI log analyzer, which is available in my GitHub repository, began with the question “Can I create a tool using AI that would analyze my logs and come up with a best guess or hypothesis for the root cause of a system problem?”

Overcoming Limitations: The Context Window Challenge

Initially I thought about just one case: analyzing a whole log. But as I started developing and testing the code, I found that AI had certain limitations. For instance, logs can be massive, often too much information for an LLM to handle. I asked two more questions: “How do I make processing a large log more manageable?” and “How do I deal with the context window limitation?”

A context window is the working memory the AI has for reading your question and producing its answer. With the help of the AI, I came up with two different approaches:

The first approach: find models that have larger context windows. Frontier models like Claude and Gemini have very large context windows. Claude, for instance, has a 200,000-token context window (a token is approximately 4 characters), which is about the size of a novel. (For a size comparison, here is an article that shows relative sizes by token count.) To address this, I added support in the application for different LLMs that have larger context windows.

The second approach: break the log into smaller chunks that fit within the LLM’s context window. When the AI is dealing with a large file, you can either cut the file into smaller pieces yourself using text-processing commands like grep or awk, or use the application’s configuration to set the chunk size to something more manageable for the LLM.

These solutions allow the application to handle very large log files and give you the answers you need.
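The chunking approach can be sketched roughly as follows. This is a minimal illustration, not the AI Log Analyzer's actual code: the function names are hypothetical, and the token estimate uses the rough "4 characters per token" rule of thumb mentioned above rather than a real tokenizer.

```python
from typing import Iterator, List

# Assumption: roughly 4 characters per token; real tokenizers vary by model.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Roughly estimate how many tokens a piece of text would use."""
    return len(text) // CHARS_PER_TOKEN + 1


def chunk_log_lines(lines: List[str], max_tokens: int) -> Iterator[List[str]]:
    """Yield groups of whole log lines whose estimated size fits max_tokens.

    Lines are never split. A single line that is larger than the budget
    is yielded as its own chunk.
    """
    chunk: List[str] = []
    used = 0
    for line in lines:
        cost = estimate_tokens(line)
        # Close the current chunk before it would exceed the budget.
        if chunk and used + cost > max_tokens:
            yield chunk
            chunk, used = [], 0
        chunk.append(line)
        used += cost
    if chunk:
        yield chunk


if __name__ == "__main__":
    # Made-up sample log lines, for illustration only.
    sample = [f"2024-01-01 12:00:{i:02d} ERROR worker {i} timed out" for i in range(10)]
    for n, piece in enumerate(chunk_log_lines(sample, max_tokens=40), start=1):
        print(f"chunk {n}: {len(piece)} lines")
```

Each chunk can then be sent to the model separately, and the per-chunk findings combined afterwards. Keeping whole lines together means a log entry is never cut in half.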

Expanding Capabilities: From Debugging to Security

From my experience with cybersecurity, another question came to mind: “What if the problem with the system was not simply a software bug or user configuration error, but the system was being hacked?” So I asked: “Can I change the log analyzer to act as a security tool or security vulnerability scanning tool?”

The answer turned out to be “Why not change the prompt that the AI agent uses? Instead of looking for a root cause related to common bugs, look for something caused by common hacking attempts or security exploits.” Just changing the prompt created another tool.
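As a rough sketch of that idea, the same code path can serve both purposes if only the instruction text changes. The prompt wording, mode names, and function name below are hypothetical and are not taken from the actual tool.

```python
# Hypothetical prompts; the real tool's wording may differ.
PROMPTS = {
    "troubleshoot": (
        "You are a systems troubleshooter. Read the log below, list several "
        "hypotheses for the root cause, and suggest a fix for each."
    ),
    "security": (
        "You are a security analyst. Read the log below, list signs of "
        "hacking attempts or security exploits, and suggest mitigations."
    ),
}


def build_prompt(mode: str, log_text: str) -> str:
    """Combine the instructions for the chosen mode with the log text."""
    if mode not in PROMPTS:
        raise ValueError(f"unknown mode: {mode!r}")
    return f"{PROMPTS[mode]}\n\n--- LOG START ---\n{log_text}\n--- LOG END ---"


if __name__ == "__main__":
    # Made-up log line, for illustration only.
    log = "Jan 01 12:00:01 host sshd[123]: Failed password for root"
    print(build_prompt("security", log))
```

Everything else (reading the log, chunking it, calling the model) stays the same; only the instructions sent along with the log differ between the two tools.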

Now I had created two tools: a system troubleshooting and root cause tool, and a security vulnerability analysis tool. Asking the right questions gives you solutions that your original assumptions, or what you think you know (your hypothesis), might not have led you to.

Connecting to Established Methodologies

This ties back to the core principles of Agile and DevOps, in which the development and refinement of code, infrastructure, and solutions begins with asking the right questions and iterating on the answers you get.

Lean manufacturing, or the Toyota Production System as it was originally called, has the practice of asking the five “whys”: Why did this happen? Why did this cause that? The “why” questions help you get down to the root cause. In the same way, you can use tools like ChatGPT to ask these questions and help you develop solutions.

The Question for You

So the question I would pose to you is: “Are you asking the right questions?” Are you asking questions about the product you’re developing or the service you’re offering, rather than telling the tool what you want? Are you asking what your users’ needs are, what the nature of your job is, and what tools you need to develop as a result?

The Result: AI Log Analyzer

As the result of this iterative process, I have developed AI Log Analyzer, which can read logs and develop multiple hypotheses about possible root causes, act as a security analysis tool, and run in a REPL or chat mode in which you can ask questions about the log analysis and possible solutions. More is planned, such as adding RAG (Retrieval-Augmented Generation), MCP (Model Context Protocol) tools, and a few other integrations (Grafana, ServiceNow, etc.) as time and participation allow. This is the result of asking the right questions about problems I often face.

Conclusion

I welcome feedback that you may have and encourage you to be curious about what you’re doing and how it can affect you and those around you.

Practice asking the right question by asking questions. You may be surprised where it may lead you.


For more information about the AI Log Analyzer, visit my GitHub repository.

Retrospect 2024 – Don’t be the “smartest man in the room”

Retrospect – “a review of or meditation on past events”
A few things come to mind when I reflect on 2024 and think about the past year’s lessons.

1 – Don’t be the “smartest man in the room.” The phrase “the smartest man in the room” came from a 1975 article by Richard Holbrooke, meaning that being the most intelligent person does not guarantee being correct or wise.
Being in the technical profession, we tend to plan, plan and plan. We look at every angle possible and do all our research by ourselves. Then we execute the plan. This is otherwise known as “Waterfall.”
Whoever comes up with the plan is “the smartest man in the room.” As the article of that name implies, it doesn’t go very well for the man with the plan.
  • The smartest man becomes the bottleneck – since he has the plan, all eyes look to him for answers about what isn’t quite clear in the plan.
  • The smartest man doesn’t know the future – since he has the plan, he makes educated guesses about what the requirements may be. He’s not Nostradamus, and the guesses are often wrong, which can cause the plan to fail.
  • The smartest man is under a lot of stress – whether or not the plan goes sideways, he ultimately can’t control the outcome.
2 – “Bring it to the team” – In the book “Coaching Agile Teams,” Lyssa Adkins often advises that you bring anything that involves the plan or affects the group to the team. This is the basic concept of Agile and the Scrum framework, and it is the opposite of being “the smartest man in the room”:
  • There is no bottleneck with a team – As the team is self-managing, everyone is in the loop. They know the tasks and the big picture, and they work with a high degree of trust.
  • The team doesn’t know the future but can adapt and change – instead of making guesses about everything up front, you plan a short, time-boxed sprint to minimize the risk of a bad assumption, and you get feedback from stakeholders to make sure you’re going in the right direction with the project.
  • The team shares in both the risks and rewards – As a group, you minimize the risk and the stress, and you can also do more collectively than individually. Almost everything you use, such as a car, a house, bread, or a laptop computer, takes a team of people, materials, and time to make.
3 – Practice “personal Agile” – Personal Scrum and other Agile tools can be used for yourself. I have my own Kanban/Scrum board that I use to keep track of my own projects and tasks. Applying the concepts from Agile to yourself helps with things like overplanning, procrastination, and being “the smartest man in the room.” Your “team” is the people in your network who can help you with your projects and fill the gaps in your knowledge.

An “Aha moment” for me was the Enterprise Technology Leadership Summit (“The DevOps Summit”) that I attended last year, which put a few of these things into perspective. While I understood and used the tools and techniques of DevOps/SRE at work, I realized that you need Agile, and an Agile mindset in your place of work, to get the most out of DevOps.

Besides the Agile Manifesto, I recommend reading “Scrum: The Art of Doing Twice the Work in Half the Time” by Jeff Sutherland. Scrum is one of the most basic frameworks.

These are the lessons I learned from 2024.

I’m looking forward to 2025, and I hope you can get something useful from this.

Book Review – Atomic Habits

I look at every day as an opportunity to improve myself, not just at New Year’s. If you’re into improving systems (such as by using DevOps), then why not improve the most important system you have… yourself?

In my opinion, James Clear has written a clear and practical guide to changing your habits by working on a system of habits in his book Atomic Habits. Resolutions such as “I will lose 500 pounds by the end of the year” or “I will make $1,000,000 a year” are for the most part impossible goals, since you don’t have the system of habits needed not only to reach the goal but to sustain it in the long run.

Without the system of habits, even if you reach your goal (let’s say losing 50 pounds), more than likely you will, months later, have not only regained the 50 pounds you lost but gained another 25 pounds on top of that. The same thing happens with Lotto winners who win a million dollars: they end up spending it all and being worse off than they were in the beginning.

What are the systems of habits? Every system is made of smaller individual components, or atomic habits. Instead of doing many individual tasks that require lots of conscious effort, why not build a collection of complementary habits? For instance, suppose you want to write a book. You start with a daily habit of writing 50 words every morning, writing not when you feel inspired but out of habit.

James Clear gives several examples of building habits that help you obtain long term goals. Marginal incremental gains will result in tremendous long term gains. Each habit has a long term cumulative effect on reaching your goals.
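The cumulative effect is easy to see with a little arithmetic. The sketch below assumes, purely for illustration, a 1% improvement each day that compounds over a year; the function name is hypothetical.

```python
def compound(daily_change: float, days: int) -> float:
    """Return the overall multiplier after compounding a daily change."""
    return (1.0 + daily_change) ** days


if __name__ == "__main__":
    # Assumption for illustration: improving by 1% every day for a year.
    print(f"1% better each day for a year: x{compound(0.01, 365):.2f}")  # x37.78
```

Under that assumption, a change that is barely noticeable on any single day adds up to a result many times better by the end of the year.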

An example he gives is the British cycling team. In over a hundred years, they had won only one Olympic gold medal in cycling and had never won the Tour de France. Enter David Brailsford, a specialist in marginal gains. He started changing little things, such as the clothing the riders wore, to reduce wind resistance. Bicycle seats were made more comfortable for the riders. Mattresses and pillows were made more comfortable to give the cyclists a better night’s sleep. They also changed the type of gels they used for muscle recovery, and painted the inside of the vans white so they could see if there was dust in them. Hundreds of small improvements. In less than five years, the British cycling team started winning gold medals. They dominated the cycling events at the Olympics in China. Three of their riders went on to win the coveted Tour de France multiple times (2012, 2013, 2015, 2016, 2017, 2018).

The idea is to work on small, consistent habits that you learn to do without thinking or effort, and that complement other habits that help you achieve your long term goals. Those habits don’t stop after you reach your goal. They continue and allow you to achieve your next goals, pushing you forward without much effort. Just getting started is the hard part.

I have already applied these principles to developing a system of habits of my own.

Here are a few examples of things I’ve done to be a better writer:

1 – Review daily a technical stack such as a framework, language, operating system, etc.

2 – Write at least 50 words a day in any form, such as an article, discovery journal, etc.

3 – Exercise every day, at least one pushup.

Each one of these habits complements the others toward the goal of being a better writer. Reviewing the nuances of a framework or language helps me keep my skills sharp on languages or frameworks I don’t use as often. Writing regularly, not just when I’m inspired or under a deadline, keeps me consistent and disciplined in my writing. Exercise helps me keep fit and manage my energy as I write.

I make it convenient to track my habits with a Goal Tracker app on my Android phone. I use an Anki app on my Android device to review things like iOS and JavaScript.

What I noticed is that, after a certain point, the daily checklist activities became automatic. They became habits I no longer had to think about tracking. All of these habits aid in reaching my long term goals.

I highly recommend this book, not only to learn something about yourself but also to learn HOW to think strategically about developing habits that help you reach your long term goals.

Using Ansible to provision VMs on AWS

I have been asked on several occasions to show how to use Ansible to provision VMs on Amazon Web Services (AWS). This is “commoditization virtualization” on demand just by running a single playbook, which is pretty cool.

Why automation in the first place?

If you’re reading this article and have any experience configuring a Unix/Linux/Windows server, whether it is a mail server, web server, or whatever, you know how time consuming it is to:

  • Partition the disk
  • Create user accounts
  • Install software packages and updates
  • Configure the server application
  • etc…

You have to wait for the packages to install, then configure and test the application to make sure it runs, and that can take a few hours.

Now the servers are virtual and live on a cloud machine somewhere, and you have to configure more than a dozen of them… that’s a lot of time, and you have better things to do. Tools like Ansible are the answer to configuring multiple machines.

Spinning up VMs using AWS web console

AWS allows you to log into a web console, choose your VM image, and bring VMs up one at a time. It will give you SSH credentials so you can log in to the VMs you just made and, from there, use Ansible to manage and configure them. Wouldn’t it be nice if you could manage the provisioning of VM instances from Ansible? Yes, you can… but it takes a bit of work.

I will describe how you can do it with a multi-step script and (in another article) programmatically.

How do I know how many VMs I have in inventory?

The challenge of dynamic inventory is that the program or playbook does not know what is in the inventory ahead of time. However, if we apply the “cattle, not pets” approach and let Ansible take care of idempotency for the VMs (it won’t clobber existing VMs or exceed the constraint on the number of VMs that should exist), then this can make our lives easier.

Without knowing the inventory, and checking it ahead of time, you are running blind.

Doing it programmatically is the best way to manage and track dynamic inventory and to use Ansible’s modules to provision VMs in AWS.
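To illustrate checking inventory before provisioning, here is a minimal sketch of an idempotent count check. It assumes you already have a response dictionary shaped like the one returned by the Boto `describe_instances` call for EC2 (a list of "Reservations", each holding "Instances"). The function names and the sample data are hypothetical, and no AWS call is made here.

```python
from typing import Dict, List


def running_instance_ids(describe_response: Dict) -> List[str]:
    """Collect the IDs of running instances from a describe_instances-style dict."""
    ids: List[str] = []
    for reservation in describe_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            if instance.get("State", {}).get("Name") == "running":
                ids.append(instance["InstanceId"])
    return ids


def instances_to_launch(desired: int, existing: int) -> int:
    """Return how many new instances are needed; never a negative number."""
    return max(desired - existing, 0)


if __name__ == "__main__":
    # Made-up inventory, for illustration only.
    response = {
        "Reservations": [
            {"Instances": [
                {"InstanceId": "i-0001", "State": {"Name": "running"}},
                {"InstanceId": "i-0002", "State": {"Name": "terminated"}},
            ]},
        ]
    }
    existing = running_instance_ids(response)
    print(f"running: {existing}, to launch: {instances_to_launch(3, len(existing))}")
```

Because the number to launch is computed from what already exists, running the same check twice does not create extra VMs, which is the idempotent behavior described above.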

Using a playbook to provision VMs

In my opinion this is a clunky way to use Ansible.

The problem with this approach is that there is no clean way to see the current inventory on AWS. You have to run a separate program before running the playbook to see what is currently in inventory.

When this is done you end up writing three or four separate scripts to manage this process. In the long run, this becomes difficult to maintain since you have to look at other scripts to understand what is going on.
Writing maintainable code is a key principle.

Build the playbook to provision AWS cloud services.

Playbook is set to localhost.

AWS keys are needed for AWS account access.

BOTO Python API libraries are installed.

Just in case you are not aware, Boto is the Python library (the AWS SDK for Python) used for programmatically managing AWS services.

AWS cloud account

Log into AWS Management Console

Under user account, select “security credentials”

In the left hand column, select user

Select the security tab.

Look for security access key.

This is what you will need for boto/ansible.

Running Vagrant

I have created a Vagrant file with an Ansible playbook for managing AWS through a Linux VM created and managed by Vagrant.

First install VirtualBox then install Vagrant.

Download the Vagrant Ansible AWS files from GitHub.

Change into the vagrant file directory and type:

vagrant up

It will take a while for all the dependencies to be downloaded.

Once vagrant is fully up, type:

vagrant ssh

to access the shell of the VM.

Preparing for instances.

Change directory to the Ansible playbook directory and modify the following file:

aws-vars.yml

and add your AWS keys.

Provisioning AWS instances.

From the shell, type:

ansible-playbook AWS-provision.yml

to start provisioning instances in AWS.

You can watch from the AWS console the instances being provisioned.

Terminating instances

From the shell, type:

ansible-playbook AWS-terminate.yml

to terminate the EC2 instances that were provisioned in your account by the provisioning playbook.

You can watch the instances being terminated from the AWS console.

This is only the beginning…

With these examples, we created self-contained machines just by running an Ansible playbook. However, we can also set up a virtual network that lets you place items such as file servers and database servers in a private network, with an “internal” and an “external” network and a “firewall”, and build more complex designs.

I may cover these examples in future articles.

In the meantime, have a great day.

The Triangle of Value

It seems like everyone wants everything these days. They want high quality products and services in the shortest amount of time at the lowest price possible. You never get all three.

I don’t recall people talking very often about this fundamental concept: the triangle of value. It’s also known by many other names. It is a basic resource constraint when you are offering a product or service.

The triangle of value is this:

You have Time, Cost and Quality – Pick two.

You can have a low cost product very quickly but the quality suffers.

You can have a high quality product relatively quickly but it will be very expensive.

You can have a high quality product at a low cost but it will take lots of time to develop.

This is true of project management and DevOps as well. At the end of the day, these are the three elements you have to work with.

Open source products also work like this. You can have a low cost product (it’s free… so to speak) that is high quality with many features, but it gets done on donated time. It may take some time before a bug is corrected or new features are added.

Other things that affect the “triangle of value”

Skill and Experience

Skill can reduce the amount of time it takes to make something and it can also affect quality of a product or service. Then again, you will pay more money for an individual with a higher skillset especially if you want to keep them around.

Technology

Technology can decrease the amount of time you spend producing a product or service. It can take the form of automation or another catalytic process. At the end of the day, all advancements in technology are catalysts for getting more out of a process. On the other hand, technology can take time and expertise to develop, and better tech can cost more money.

How does this understanding play into DevOps or anywhere else?

In an ideal world, you may be able to hire the brightest minds who are up to the task, have the best equipment, plenty of lead time for getting to market and an unlimited budget…but this is not the case.

It may be more like this: you could only fill 2 of the 5 positions for experienced programmers (maybe you’re one of them) and they aren’t that bright (they just think they are), you have second hand equipment that is a few years old, your budget is a quarter of what you were promised, and the project was due last week.

That’s ok, this is the reality of the industry. You do the best you can with what you have. The same principles apply.

If you don’t have enough manpower, you make it up with overtime and finding leverage somewhere…maybe automation.

If you don’t have the latest equipment, let’s say it’s slow…you make it up by getting more of the older equipment and use them in parallel using some clever programming and networking.

If you have a short lead time, you loosen your “quality” regimen (as in forgoing tests and hoping your developers don’t make mistakes writing code) to save on the time you spend in development.

DevOps CI/CD, the “Holy Grail,” may be a lofty goal to achieve if your resources are limited, but it is worthwhile and doable. It may take some time to figure out how to do it and to build and train the resources to do it.

At the end of the day, what you most value is what you will get. You can’t have it all but you can choose what you can live with.

I am open to feedback and any suggestions you may have. Until then, have a good day.